The great descriptor melting pot: mixing descriptors for the common good of QSAR models

  • Perspective
  • Journal of Computer-Aided Molecular Design

Abstract

The utility of QSAR modeling depends heavily on two steps: estimating the values of molecular descriptors relevant to the endpoints of interest, and then selecting an optimized subset of those descriptors to form the best QSAR models from a representative set of those endpoints. The performance of a QSAR model is directly related to its molecular descriptors. QSAR modeling, specifically model construction and optimization, has benefited from its ability to borrow from other, unrelated fields, yet the molecular descriptors that form QSAR models have remained basically unchanged in both form and preferred usage. Many types of endpoints require multiple classes of descriptors (descriptors that encode 1D through multi-dimensional, 4D and above, content) to capture most fully the molecular features and interactions that contribute to the endpoint. The advantages of QSAR models constructed from multiple, different descriptor classes have been demonstrated in the exploration of markedly different, principally biological, systems and endpoints. Multiple examples of such QSAR applications using different descriptor sets are described and examined. The take-home message is that a major part of the future of QSAR analysis, including its application to modeling biological potency and ADME-Tox properties, its general use in virtual screening, and its expanding use in new fields for building QSPR models, lies in developing strategies that combine and use 1D through nD molecular descriptors.
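As a rough illustration of the workflow the abstract outlines (pool descriptors from several classes, then select a subset), the following minimal Python sketch may help. It is not the authors' method: the selection step is a simple correlation filter standing in for whatever optimizer a real study would use, and every name and number in it (`pool_descriptors`, `select_descriptors`, the descriptor names, the class labels, and the toy values) is a hypothetical assumption made for the example.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def pool_descriptors(blocks):
    """Merge per-class descriptor tables into one pool keyed by 'class:name'."""
    pool = {}
    for cls, table in blocks.items():
        for name, values in table.items():
            pool[f"{cls}:{name}"] = values
    return pool

def select_descriptors(pool, endpoint, k=3, max_redundancy=0.9):
    """Greedy filter: rank by |r| with the endpoint, skip near-duplicate descriptors."""
    ranked = sorted(pool, key=lambda d: abs(pearson(pool[d], endpoint)), reverse=True)
    chosen = []
    for d in ranked:
        # Keep a descriptor only if it is not highly correlated with one already chosen.
        if all(abs(pearson(pool[d], pool[c])) < max_redundancy for c in chosen):
            chosen.append(d)
        if len(chosen) == k:
            break
    return chosen

# Hypothetical toy data: six molecules, three descriptor classes.
# Class labels and values are illustrative only, not taken from the article.
blocks = {
    "1D": {"mol_weight": [180.0, 194.0, 151.0, 206.0, 230.0, 138.0]},
    "2D": {"logP": [1.2, -0.1, 0.5, 3.5, 2.9, 2.3],
           "logP_copy": [1.2, -0.1, 0.5, 3.5, 2.9, 2.3]},  # deliberate duplicate
    "3D": {"polar_surface": [63.6, 58.4, 49.3, 37.3, 40.5, 57.5]},
}
endpoint = [4.1, 3.0, 3.6, 6.2, 5.8, 4.9]  # made-up activity values

pool = pool_descriptors(blocks)
selected = select_descriptors(pool, endpoint, k=2)
print(selected)
```

Prefixing each descriptor with its class keeps the pooled table traceable, so a selected model can be inspected for how many descriptor classes it actually draws on. The redundancy threshold is an arbitrary choice for the sketch; a real analysis would tune it or replace the filter with a proper model-building procedure.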

Author information

Corresponding author

Correspondence to Emilio Xavier Esposito.

About this article

Cite this article

Tseng, Y.J., Hopfinger, A.J. & Esposito, E.X. The great descriptor melting pot: mixing descriptors for the common good of QSAR models. J Comput Aided Mol Des 26, 39–43 (2012). https://doi.org/10.1007/s10822-011-9511-4

  • DOI: https://doi.org/10.1007/s10822-011-9511-4
