Journal of Computer-Aided Molecular Design, Volume 29, Issue 10, pp 937–950

Design of chemical space networks using a Tanimoto similarity variant based upon maximum common substructures

  • Bijun Zhang
  • Martin Vogt
  • Gerald M. Maggiora
  • Jürgen Bajorath


Chemical space networks (CSNs) have recently been introduced as an alternative to other coordinate-free and coordinate-based chemical space representations. In CSNs, nodes represent compounds and edges represent pairwise similarity relationships. In addition, nodes are annotated with compound property information such as biological activity. CSNs have been applied to view biologically relevant chemical space in comparison to random chemical space samples and were found to display well-resolved topologies at low edge density levels. The way in which molecular similarity relationships are assessed is an important determinant of CSN topology. Previous CSN versions were based on either numerical similarity functions or the assessment of substructure-based similarity. Herein, we report a new CSN design that combines numerical and substructure-based similarity evaluation. This is achieved by calculating numerical similarity values on the basis of maximum common substructures (MCSs) of compounds, leading to the introduction of MCS-based CSNs (MCS-CSNs). This CSN design combines the advantages of continuous numerical similarity functions with a robust and chemically intuitive substructure-based assessment. Compared to earlier versions of CSNs, MCS-CSNs are characterized by a further improved organization of local compound communities, as exemplified by the delineation of drug-like subspaces in regions of biologically relevant chemical space.
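The abstract does not spell out the similarity formula, so the following is only a minimal sketch of how an MCS-based Tanimoto variant and a thresholded CSN might be computed. It assumes that molecule and MCS sizes are given as precomputed counts (for example, bond counts; atom counts would work the same way) and that an edge is drawn whenever the similarity reaches a chosen threshold. The function names, the toy compounds, and the threshold of 0.6 are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def mcs_tanimoto(size_a, size_b, size_mcs):
    """Tanimoto-style ratio: shared / (a + b - shared).

    Sizes are assumed to be counts (e.g. bonds) of molecule A,
    molecule B, and their maximum common substructure.
    """
    if size_mcs > min(size_a, size_b):
        raise ValueError("MCS cannot be larger than either molecule")
    union = size_a + size_b - size_mcs
    return size_mcs / union if union else 0.0


def build_csn(sizes, mcs_sizes, threshold):
    """Return edges (a, b, similarity) for pairs at or above the threshold.

    sizes:     dict mapping compound id -> size of the compound
    mcs_sizes: dict mapping (id_a, id_b) with id_a < id_b -> MCS size
    """
    edges = []
    for a, b in combinations(sorted(sizes), 2):
        shared = mcs_sizes.get((a, b), 0)  # missing pair: no common substructure
        sim = mcs_tanimoto(sizes[a], sizes[b], shared)
        if sim >= threshold:
            edges.append((a, b, sim))
    return edges


def edge_density(n_nodes, n_edges):
    """Fraction of all possible undirected edges that are present."""
    possible = n_nodes * (n_nodes - 1) // 2
    return n_edges / possible if possible else 0.0


# Toy data (hypothetical sizes, not from the paper).
sizes = {"cpd1": 20, "cpd2": 22, "cpd3": 18}
mcs_sizes = {("cpd1", "cpd2"): 18, ("cpd1", "cpd3"): 9, ("cpd2", "cpd3"): 8}

edges = build_csn(sizes, mcs_sizes, threshold=0.6)
density = edge_density(len(sizes), len(edges))
print(edges, density)
```

In this toy case only the pair cpd1/cpd2 (18 / (20 + 22 - 18) = 0.75) passes the threshold. Raising or lowering the threshold is one way to steer the network toward a target edge density; in practice the MCS sizes would come from a cheminformatics toolkit rather than a hand-written table.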


Keywords: Chemical space networks; Maximum common substructures; Tanimoto similarity; Biologically relevant chemical space; Drug-like subspaces; Network science



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Bijun Zhang (1)
  • Martin Vogt (1)
  • Gerald M. Maggiora (2, 3)
  • Jürgen Bajorath (1)

  1. Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
  2. BIO5 Institute, University of Arizona, Tucson, USA
  3. Translational Genomics Research Institute, Phoenix, USA
