Abstract
Chemical Space Networks (CSNs) are generated for different compound data sets on the basis of pairwise similarity relationships. Such networks are thought to complement and further extend traditional coordinate-based views of chemical space. Our proof-of-concept study focuses on CSNs based upon fingerprint similarity relationships calculated using the conventional Tanimoto similarity metric. The resulting CSNs are characterized with statistical measures from network science and compared in different ways. We show that the homophily principle, which is widely considered in the context of social networks, is a major determinant of the topology of CSNs of bioactive compounds, designed as threshold networks, typically giving rise to community structures. Many properties of CSNs are influenced by numerical features of the conventional Tanimoto similarity metric and largely dominated by the edge density of the networks, which depends on chosen similarity threshold values. However, properties of different CSNs with constant edge density can be directly compared, revealing systematic differences between CSNs generated from randomly collected or bioactive compounds.
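The threshold-network idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the fingerprints are hypothetical toy sets of on-bit indices, the function names are invented for illustration, and edge density is assumed to follow the standard definition for undirected graphs, 2E / (N(N - 1)).

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def threshold_network(fps, threshold):
    """Return edges (i, j), i < j, whose Tanimoto similarity is >= threshold."""
    return [
        (i, j)
        for i, j in combinations(range(len(fps)), 2)
        if tanimoto(fps[i], fps[j]) >= threshold
    ]


def edge_density(n_nodes, edges):
    """Fraction of all possible undirected edges that are present."""
    if n_nodes < 2:
        return 0.0
    return 2 * len(edges) / (n_nodes * (n_nodes - 1))


# Hypothetical toy fingerprints (sets of on-bit positions), not real compounds.
fingerprints = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({1, 2, 6, 7}),
    frozenset({8, 9, 10}),
]

edges = threshold_network(fingerprints, 0.5)
density = edge_density(len(fingerprints), edges)
print(edges, round(density, 3))
```

Lowering the threshold in this sketch adds edges and raises the edge density, which is why the abstract compares different CSNs only at constant edge density rather than at a fixed similarity threshold.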
Acknowledgments
M. Z. has partly been supported by the German Academic Exchange Service (Deutscher Akademischer Austauschdienst, DAAD).
Cite this article
Zwierzyna, M., Vogt, M., Maggiora, G.M. et al. Design and characterization of chemical space networks for different compound data sets. J Comput Aided Mol Des 29, 113–125 (2015). https://doi.org/10.1007/s10822-014-9821-4
DOI: https://doi.org/10.1007/s10822-014-9821-4