
Inflation of correlation in the pursuit of drug-likeness

  • Perspective
Journal of Computer-Aided Molecular Design


Drug-likeness is a frequently invoked, although not always precisely defined, concept in drug discovery. Opinions on drug-likeness are to a large extent shaped by the relationships that are observed between surrogate measures of drug-likeness (e.g. aqueous solubility; permeability; pharmacological promiscuity) and fundamental physicochemical properties (e.g. lipophilicity; molecular size). This article draws on examples from the literature to highlight approaches to data analysis that exaggerate trends in data, and it introduces the term correlation inflation in the context of drug discovery. Averaging groups of data points prior to analysis is a common cause of correlation inflation, and results from analysis of binned continuous data should always be treated with caution.
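The effect of averaging described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the article: it uses synthetic data, and the slope (0.3), the noise level, the sample size, and the number of bins are all assumptions chosen for illustration. It compares the Pearson correlation coefficient computed on the individual data points with the one computed on bin means after sorting by the descriptor.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def binned_means(xs, ys, n_bins):
    """Sort points by x, split into equal-count bins, return per-bin means."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // n_bins
    bin_x, bin_y = [], []
    for i in range(n_bins):
        chunk = pairs[i * size:(i + 1) * size]
        bin_x.append(mean(p[0] for p in chunk))
        bin_y.append(mean(p[1] for p in chunk))
    return bin_x, bin_y


# Synthetic data (assumption): a weak linear trend buried in noise.
rng = random.Random(0)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.3 * xi + rng.gauss(0, 1) for xi in x]

r_raw = pearson_r(x, y)                                # individual points
r_binned = pearson_r(*binned_means(x, y, n_bins=10))   # bin averages only

print(f"r on raw data:  {r_raw:.2f}")
print(f"r on bin means: {r_binned:.2f}")
```

Averaging within bins removes most of the scatter in y while leaving the trend across bins intact, so the correlation between bin means is much stronger than the correlation in the underlying data. The binned value describes how well the averages line up, not how well the descriptor predicts the property for an individual compound.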






We thank Anthony Nicholls for valuable advice and the reviewers of the manuscript for their helpful and constructive feedback. We are grateful to Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP) and Conselho Nacional de Pesquisa (CNPq) for financial support and OpenEye Scientific Software for an academic software license.

Author information



Corresponding author

Correspondence to Peter W. Kenny.

Electronic supplementary material


Supplementary material 1 (TXT 626 kb)


About this article

Cite this article

Kenny, P.W., Montanari, C.A. Inflation of correlation in the pursuit of drug-likeness. J Comput Aided Mol Des 27, 1–13 (2013).
