Principal Component Analysis as a Tool for Library Design: A Case Study Investigating Natural Products, Brand-Name Drugs, Natural Product-Like Libraries, and Drug-Like Libraries

  • Todd A. Wenderski
  • Christopher F. Stratton
  • Renato A. Bauer
  • Felix Kopp
  • Derek S. Tan
Part of the Methods in Molecular Biology book series (MIMB, volume 1263)


Principal component analysis (PCA) is a useful tool in the design and planning of chemical libraries. PCA can be used to reveal differences in structural and physicochemical parameters between various classes of compounds by displaying them in a convenient graphical format. Herein, we demonstrate the use of PCA to gain insight into structural features that differentiate natural products, synthetic drugs, natural product-like libraries, and drug-like libraries, and show how the results can be used to guide library design.
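As a rough illustration of the idea, and not the chapter's own protocol, the sketch below runs PCA on a small descriptor table using only NumPy. The descriptor columns, the values, and the class labels are made-up placeholders chosen for illustration; the function name `pca_scores` is likewise hypothetical. The descriptors are autoscaled first, which is an assumption about preprocessing rather than something stated in the abstract.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project rows of X onto the top principal components.

    X: (n_compounds, n_descriptors) array of descriptor values.
    Returns (scores, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    # Autoscale each descriptor to zero mean and unit variance so that
    # descriptors measured on different scales contribute comparably.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - mean) / std
    # SVD of the scaled matrix; rows of Vt are the component loadings.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Placeholder descriptor table (made-up numbers, NOT real compound data).
# Hypothetical columns: molecular weight, H-bond donors, rotatable bonds.
X = np.array([
    [350.0, 2, 5],
    [420.0, 1, 7],
    [810.0, 5, 12],
    [760.0, 6, 10],
])
labels = ["drug-like", "drug-like",
          "natural product-like", "natural product-like"]

scores, explained = pca_scores(X)
for label, (pc1, pc2) in zip(labels, scores):
    print(f"{label:22s} PC1={pc1:+.2f} PC2={pc2:+.2f}")
```

Plotting the first two score columns, colored by compound class, gives the kind of graphical comparison the abstract describes; the loadings (rows of `Vt`) indicate which descriptors drive the separation along each component.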

Key words

Principal component analysis (PCA); Medium rings; Macrocycles; Ring expansion; Natural products; Drugs; Libraries; Diversity-oriented synthesis



We thank Tony D. Davis (MSKCC) for suggesting inclusion of the logD, van der Waals surface area, and relative polar surface area parameters, and for providing modifications of this protocol for Windows users. Instant JChem was generously provided by ChemAxon. Financial support from the NIH (P41 GM076267 to D.S.T., P41 GM076267-03S1 to R.A.B., T32 CA062948-Gudas to T.A.W.), Starr Foundation, Tri-Institutional Stem Cell Initiative, Alfred P. Sloan Foundation (Research Fellowship to D.S.T.), Deutscher Akademischer Austauschdienst (DAAD, postdoctoral fellowship to F.K.), William H. Goodwin and Alice Goodwin and the Commonwealth Foundation for Cancer Research, and the MSKCC Experimental Therapeutics Center is gratefully acknowledged.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Todd A. Wenderski (1)
  • Christopher F. Stratton (2)
  • Renato A. Bauer (2)
  • Felix Kopp (1)
  • Derek S. Tan (1, 2)

  1. Molecular Pharmacology & Chemistry Program, Memorial Sloan Kettering Cancer Center, New York, USA
  2. Tri-Institutional Ph.D. Program in Chemical Biology, Memorial Sloan Kettering Cancer Center, New York, USA
