Comparing predictive ability of QSAR/QSPR models using 2D and 3D molecular representations

Sato, Akinori; Miyao, Tomoyuki; Jasial, Swarit; Funatsu, Kimito

doi:10.1007/s10822-020-00361-7

Published: 04 January 2021

Volume 35, pages 179–193 (2021)

Journal of Computer-Aided Molecular Design

Akinori Sato^1,
Tomoyuki Miyao^1,2,
Swarit Jasial^1,2 &
Kimito Funatsu^2,3 (ORCID: orcid.org/0000-0002-9368-0302)

Abstract

Quantitative structure–activity relationship (QSAR) and quantitative structure–property relationship (QSPR) models predict biological activities and molecular properties from the numerical relationship between chemical structures and activity (or property) values. Molecular representations are important in QSAR/QSPR analysis. Topological information on molecular structures (2D representations) is usually used for this purpose. However, conformational information also seems important because molecules exist in three-dimensional space. As a three-dimensional molecular representation applicable to diverse compounds, the similarity between a test molecule and a set of reference molecules has been proposed previously. This 3D representation was found to be effective in virtual screening for early enrichment of active compounds. In this study, we introduced the 3D representation into QSAR/QSPR modeling (regression tasks). Furthermore, we investigated the relative merits of 3D representations over 2D representations in terms of the diversity of the training data sets. For predicting quantum mechanics-based properties, the 3D representations were superior to the 2D representations. For predicting the activity of small molecules against specific biological targets, no consistent trend was observed in the performance difference between the two types of representations, irrespective of the diversity of the training data sets.
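
The idea of describing a molecule by its similarity to a set of reference molecules can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`tanimoto`, `similarity_profile`), the toy feature sets, and the use of a set-based Tanimoto coefficient are all assumptions made for illustration. In the 3D setting described in the abstract, a 3D similarity score between conformers would take the place of `tanimoto`.

```python
from typing import Callable, List, Sequence, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two feature sets.

    Assumption: a stand-in similarity; a 3D similarity score
    would be plugged in here for a 3D representation.
    """
    union = a | b
    if not union:
        return 0.0  # two empty sets: define similarity as 0
    return len(a & b) / len(union)


def similarity_profile(
    mol: Set[int],
    references: Sequence[Set[int]],
    similarity: Callable[[Set[int], Set[int]], float] = tanimoto,
) -> List[float]:
    """Describe `mol` by its similarity to each reference molecule.

    The result is a fixed-length vector (one entry per reference),
    so it can be used as the input of any regression model.
    """
    return [similarity(mol, ref) for ref in references]


# Hypothetical toy data: each "molecule" is a set of feature indices.
references = [{1, 2, 3}, {3, 4}, {5, 6, 7, 8}]
query = {1, 2, 3, 4}

profile = similarity_profile(query, references)
print(profile)  # [0.75, 0.5, 0.0]
```

Because the descriptor length equals the number of reference molecules, the same vector layout applies to any test molecule, which is what makes such a representation usable across diverse compounds.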

Acknowledgements

We thank OpenEye Scientific Software Inc., for providing a free academic license of the OpenEye chemistry toolkits. This work was supported by JSPS KAKENHI Grant Number JP20K19922.

Author information

Authors and Affiliations

Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
Akinori Sato, Tomoyuki Miyao & Swarit Jasial
Data Science Center, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
Tomoyuki Miyao, Swarit Jasial & Kimito Funatsu
Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-8656, Japan
Kimito Funatsu

Corresponding author

Correspondence to Kimito Funatsu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Electronic supplementary material 1 (CSV 275 kb)

Electronic supplementary material 2 (CSV 413 kb)

Electronic supplementary material 3 (DOCX 41 kb)

About this article

Cite this article

Sato, A., Miyao, T., Jasial, S. et al. Comparing predictive ability of QSAR/QSPR models using 2D and 3D molecular representations. J Comput Aided Mol Des 35, 179–193 (2021). https://doi.org/10.1007/s10822-020-00361-7

Received: 05 July 2020
Accepted: 12 November 2020
Published: 04 January 2021
Issue Date: February 2021
DOI: https://doi.org/10.1007/s10822-020-00361-7
