Journal of Computer-Aided Molecular Design, Volume 25, Issue 7, pp 649–662

Visualisation of the chemical space of fragments, lead-like and drug-like molecules in PubChem

  • Ruud van Deursen
  • Lorenz C. Blum
  • Jean-Louis Reymond
Article

DOI: 10.1007/s10822-011-9437-x

Cite this article as:
van Deursen, R., Blum, L.C. & Reymond, JL. J Comput Aided Mol Des (2011) 25: 649. doi:10.1007/s10822-011-9437-x

Abstract

The 4.5 million organic molecules with up to 20 non-hydrogen atoms in PubChem were analyzed using the MQN-system, which consists of 42 integer-valued descriptors of molecular structure. The 42-dimensional MQN-space was visualised by principal component analysis, plotting the (PC1, PC2), (PC1, PC3) and (PC2, PC3) planes. The molecules were organized according to the ring count (PC1, 38% of variance), the molecular size (PC2, 25% of variance), and the H-bond acceptor count (PC3, 12% of variance). Compounds following Lipinski’s bioavailability, Oprea’s lead-likeness and Congreve’s fragment-likeness criteria formed separate groups in MQN-space, visible in the (PC2, PC3) plane. MQN-similarity searches of the 4.5 million molecules (see the browser available at www.gdb.unibe.ch) gave significant enrichment factors for recovering groups of fragment-sized bioactive compounds related to ten different biological targets taken from ChEMBL, and revealed lead-hopping relationships not seen with substructure fingerprint similarity searches. The diversity of different compound series was analyzed by MQN-distance histograms.
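The similarity search, enrichment factor and distance histogram mentioned in the abstract can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not confirm: the distance between two descriptor vectors is taken to be the city-block (Manhattan) distance, the enrichment factor follows the common definition (hit rate in the top-ranked subset divided by the hit rate in the whole library), and the 4-dimensional vectors are made-up stand-ins for real 42-dimensional MQN vectors. The function names are hypothetical and are not taken from the authors' software.

```python
from collections import Counter
from itertools import combinations


def cityblock(a, b):
    """City-block (Manhattan) distance between two equal-length integer vectors.

    Assumption: the abstract does not name the metric; city-block is used here
    only for illustration.
    """
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_distance(query, library):
    """Return library indices sorted by increasing distance to the query."""
    return sorted(range(len(library)), key=lambda i: cityblock(query, library[i]))


def enrichment_factor(ranked, actives, top_n):
    """Hit rate among the top_n ranked items divided by the library-wide hit rate."""
    hits = sum(1 for i in ranked[:top_n] if i in actives)
    return (hits / top_n) / (len(actives) / len(ranked))


def distance_histogram(vectors):
    """Count how often each pairwise distance occurs within a compound series."""
    return Counter(cityblock(a, b) for a, b in combinations(vectors, 2))


# Toy 4-dimensional vectors standing in for 42-dimensional MQN vectors.
# All values are invented for this example.
library = [
    [6, 1, 0, 1],
    [6, 1, 1, 1],
    [12, 3, 2, 2],
    [7, 1, 0, 1],
    [15, 4, 3, 0],
    [5, 0, 0, 1],
]
actives = {0, 1, 3}   # hypothetical indices of known bioactive compounds
query = [6, 1, 0, 1]  # hypothetical query compound

ranked = rank_by_distance(query, library)
ef = enrichment_factor(ranked, actives, top_n=3)
hist = distance_histogram([library[i] for i in sorted(actives)])

print("ranking:", ranked)
print("enrichment factor (top 3):", ef)
print("distance histogram of actives:", dict(hist))
```

In this toy setting an enrichment factor above 1 means the top of the ranked list contains a larger share of actives than the library as a whole, and a histogram concentrated at small distances indicates a compound series with low diversity in descriptor space.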

Keywords

  • PubChem
  • Fragments
  • Chemical space
  • Virtual screening

Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  • Ruud van Deursen (1)
  • Lorenz C. Blum (1)
  • Jean-Louis Reymond (1)
  1. Department of Chemistry and Biochemistry, Swiss National Center of Competence in Research, NCCR-TransCure, University of Berne, Berne, Switzerland
