Variable selection and model validation of 2D and 3D molecular descriptors

Journal of Computer-Aided Molecular Design

Abstract

We have found that molecular shape and electrostatics, in conjunction with 2D structural fingerprints, are important variables in discriminating classes of active and inactive compounds. This paper explores how to select these variables and how to identify their relative importance in quantitative structure–activity relationship (QSAR) analysis. We show the use of these variables in a form of similarity searching with respect to the crystal structure of a known bound ligand. This analysis is then validated through k-fold cross-validation of enrichments via several common classifiers. Additionally, we show an effective methodology for using the variables in hypothesis generation, that is, in cases where the crystal structure of a bound ligand is not known.
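As a rough illustration of what k-fold cross-validation of enrichments can look like, the following Python sketch scores held-out compounds with a classifier trained on the remaining folds and averages the enrichment factor over the folds. Everything specific here is an assumption made for illustration and is not taken from the paper: the nearest-centroid classifier stands in for the unnamed "common classifiers", the three descriptor columns (shape, electrostatic, and 2D fingerprint similarity to a reference ligand) are synthetic random numbers, and the function names and the 10% cutoff are hypothetical.

```python
import random


def enrichment_factor(scores, labels, fraction=0.1):
    """Hit rate in the top-scoring fraction divided by the overall hit rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    top_hits = sum(label for _, label in ranked[:n_top])
    total_hits = sum(labels)
    if total_hits == 0:
        return 0.0  # no actives in this set, so enrichment is undefined; report 0
    return (top_hits / n_top) / (total_hits / len(labels))


def stratified_folds(labels, k, seed=0):
    """Deal actives and inactives round-robin so each fold keeps the class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, label in enumerate(labels) if label == cls]
        rng.shuffle(members)
        for position, index in enumerate(members):
            folds[position % k].append(index)
    return folds


def fit_centroids(X, y):
    """Nearest-centroid model: mean descriptor vector per class.

    Assumes both classes occur in the training data.
    """
    centroids = {}
    for cls in (0, 1):
        rows = [x for x, label in zip(X, y) if label == cls]
        centroids[cls] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def activity_score(centroids, x):
    """Higher when x is closer to the active centroid than to the inactive one."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return sq_dist(centroids[0], x) - sq_dist(centroids[1], x)


def cross_validated_enrichment(X, y, k=5, fraction=0.1, seed=0):
    """Mean enrichment factor over k held-out folds."""
    per_fold = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        scores = [activity_score(model, X[i]) for i in held_out]
        per_fold.append(enrichment_factor(scores, [y[i] for i in held_out], fraction))
    return sum(per_fold) / len(per_fold)


def make_synthetic_data(n_active=20, n_inactive=180, seed=1):
    """Synthetic (shape, electrostatic, 2D fingerprint) similarities; not real data."""
    rng = random.Random(seed)
    X = [[rng.uniform(0.7, 0.9) for _ in range(3)] for _ in range(n_active)]
    X += [[rng.uniform(0.1, 0.5) for _ in range(3)] for _ in range(n_inactive)]
    y = [1] * n_active + [0] * n_inactive
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    print("mean enrichment factor:", cross_validated_enrichment(X, y, k=5))
```

Stratifying the folds is a design choice of this sketch: with few actives, purely random folds can leave a held-out fold with no actives at all, which makes its enrichment factor meaningless.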



Author information

Corresponding author

Correspondence to Anthony Nicholls.


About this article

Cite this article

Nicholls, A., MacCuish, N.E. & MacCuish, J.D. Variable selection and model validation of 2D and 3D molecular descriptors. J Comput Aided Mol Des 18, 451–474 (2004). https://doi.org/10.1007/s10822-004-5202-8


  • DOI: https://doi.org/10.1007/s10822-004-5202-8
