Visualisation and subsets of the chemical universe database GDB-13 for virtual screening
The chemical universe database GDB-13, which enumerates 977 million organic molecules of up to 13 atoms of C, N, O, S and Cl following simple chemical stability and synthetic feasibility rules, represents a vast reservoir for new fragments. GDB-13 was classified using the MQN-system discussed in the preceding paper for the analysis of PubChem fragments. Two hundred and fifty-five subsets of GDB-13 were generated by the combinatorial use of eight restrictive criteria, including fragment-like (“rule of three”) and scaffold-like (no acyclic carbon atoms) filters. Virtual screening for analogs of 15 commercial drugs with 13 or fewer non-hydrogen atoms shows that retrieving MQN-neighbors of a query molecule from GDB-13 or its subsets provides on average a 38-fold enrichment in structural analogs (Daylight-type substructure fingerprint Tanimoto coefficient T_SF > 0.7), and a 75-fold enrichment in shape-similar analogs (ROCS TanimotoCombo score > 1.4). An MQN-searchable version of GDB-13 is provided at www.gdb.unibe.ch.
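Two quantities in the abstract can be illustrated with a short sketch. The count of 255 subsets is consistent with taking every non-empty combination of eight criteria (2^8 - 1 = 255), and the structural-analog threshold T_SF > 0.7 refers to a Tanimoto coefficient between fingerprints. The code below is only an illustration under stated assumptions: the criterion names other than the two mentioned in the abstract are hypothetical placeholders, the fingerprints are made-up sets of on-bit indices rather than real Daylight-type fingerprints, and the function names are not taken from the paper.

```python
from itertools import combinations

# Only "fragment_like" and "scaffold_like" are named in the abstract;
# the remaining six names are hypothetical placeholders.
CRITERIA = ["fragment_like", "scaffold_like"] + [
    f"criterion_{i}" for i in range(3, 9)
]


def nonempty_combinations(items):
    """Return every non-empty combination of `items` as a list of tuples."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


subsets = nonempty_combinations(CRITERIA)
print(len(subsets))  # 255, i.e. 2**8 - 1

# Toy fingerprints (made up): 2 shared bits out of 4 distinct bits.
print(tanimoto({1, 2, 3}, {2, 3, 4}))  # 0.5
```

With this definition, a pair of molecules would count as structural analogs in the sense of the abstract when `tanimoto(...)` on their substructure fingerprints exceeds 0.7.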
Keywords: Databases, Virtual screening, Chemical space, Enumeration, Fragments
This work was supported financially by the University of Berne, the Swiss National Science Foundation and the Office Fédéral Suisse de l’Education et de la Science.
- 10. van Deursen R, Blum LC, Reymond JL (2011) Visualisation of the chemical space of fragments, lead-like and drug-like molecules in PubChem. J Comput Aided Mol Des. doi: 10.1007/s10822-011-9437-x
- 13. Fink T, Reymond JL (2007) Virtual exploration of the chemical universe up to 11 atoms of C, N, O, F: assembly of 26.4 million structures (110.9 million stereoisomers) and analysis for new ring systems, stereochemistry, physicochemical properties, compound classes, and drug discovery. J Chem Inf Model 47:342–353
- 26. Willett P, Barnard JM, Downs GM (1998) Chemical similarity searching. J Chem Inf Comput Sci 38:983–996
- 28. McKay BD (1981) Practical graph isomorphism. Congr Numerantium 30:45–87
- 39. Jolliffe IT (2002) Principal component analysis, 2nd edn. Springer, New York