Abstract
The selection of a sample of diverse compounds is a common strategy for exploring large molecular libraries. However, the success of such an approach depends on the selection of relevant molecular descriptors and the use of appropriate sampling methods. In pharmaceutical research, the molecular descriptors should be based on physicochemical properties related to the pharmacological behaviour of the compounds. In this respect, the alignment-free GRIND and VolSurf molecular descriptors are promising candidates, since they have been used successfully to model both pharmacodynamic and pharmacokinetic properties of drugs. This work describes the use of these descriptors in the diversity sampling of a library of primary amines and compares the results with those obtained in a previous study that used quantum-mechanical descriptors. As in the previous work, principal component (PC) analysis was applied to reduce the dimensionality and remove redundant information from the original descriptors, and the compounds were sampled on the basis of k-means clustering in the space of the selected PCs. The results show that VolSurf and GRIND provide sampling of similar quality with regard to global features of the molecules such as hydrophilicity, but they treat the topology of the compounds differently. The similarity between particular compounds depends strongly on the original descriptors used. However, all the samples selected in the PC space after k-means clustering show the same apparent diversity relative to the whole dataset. The results indicate that there is no best set of descriptors on a diversity basis; the selection of descriptors must be based on the drug features to be investigated.
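The sampling workflow summarised in the abstract (PC analysis of the descriptors, followed by k-means clustering in the PC space) can be illustrated with a minimal sketch. This is not the authors' implementation, which relied on VolSurf, Almond and SPSS. The function names, the autoscaling step, the plain Lloyd's k-means, and the rule of picking the compound closest to each cluster centroid are all assumptions made for illustration, and the descriptor matrix is synthetic random data.

```python
import numpy as np


def pca_scores(X, n_components):
    """Autoscale a (compounds x descriptors) matrix and return PC scores."""
    std = X.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)  # leave constant descriptors unscaled
    Xs = (X - X.mean(axis=0)) / std
    # Rows of Vt are the PC loadings, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T


def _assign(points, centroids):
    """Index of the nearest centroid for every point."""
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _assign(points, centroids)
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, _assign(points, centroids)


def diversity_sample(X, n_components, k, seed=0):
    """Assumed selection rule: one compound per cluster, nearest its centroid."""
    scores = pca_scores(X, n_components)
    centroids, labels = kmeans(scores, k, seed=seed)
    picks = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:  # a cluster can end up empty; skip it
            continue
        d = np.linalg.norm(scores[members] - centroids[j], axis=1)
        picks.append(int(members[d.argmin()]))
    return sorted(picks)


# Synthetic stand-in for a descriptor matrix: 60 compounds, 8 descriptors.
X_demo = np.random.default_rng(42).normal(size=(60, 8))
print(diversity_sample(X_demo, n_components=3, k=5))
```

The returned indices identify the sampled compounds in the original library. Because k-means depends on its starting centroids, the seed is fixed here so that repeated runs select the same compounds.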
References
Potter, T. and Matter, H., Random or rational design? Evaluation of diverse compound subsets from chemical structure databases, J. Med. Chem., 41 (1998) 478–488.
Zheng, W., Cho, S. J., Waller, C. L. and Tropsha, A., Rational combinatorial library design. 3. Simulated annealing guided evaluation (SAGE) of molecular diversity: A novel computational tool for universal library design and database mining, J. Chem. Inf. Comput. Sci., 39 (1999) 738–746.
Blaney, J. M. and Martin, E. J., Computational approaches for combinatorial library design and molecular diversity analysis, Curr. Opin. Chem. Biol., 1 (1997) 54–59.
Bures, M. G. and Martin, Y. C., Computational methods in molecular diversity and combinatorial chemistry, Curr. Opin. Chem. Biol., 2 (1998) 376–380.
Gorse, D. and Lahana, R., Functional diversity of compound libraries, Curr. Opin. Chem. Biol., 4 (2000) 287–294.
Willett, P., Chemoinformatics-Similarity and diversity in chemical libraries, Curr. Opin. Biotechnol., 11 (2000) 85–88.
Tropsha, A. and Zheng, W., Rational principles of compound selection for combinatorial library design, Comb. Chem. High Throughput Screen., 5 (2002) 111–123.
Beavers, M. P. and Chen, X., Structure-based combinatorial library design: Methodologies and applications, J. Mol. Graph. Model., 20 (2002) 463–468.
Martin, Y. C., Diverse viewpoints on computational aspects of molecular diversity, J. Comb. Chem., 3 (2001) 231–250.
Gutiérrez-de-Terán, H., Lozano, J. J., Segarra, V. and Sanz, F., Molecular diversity sample generation on the basis of quantum-mechanical computations and principal component analysis, Comb. Chem. High Throughput Screen., 5 (2002) 49–57.
Gillet, V. J., Background theory of molecular diversity, In Dean, P. M. and Lewis, R. A. (eds.), Molecular Diversity in Drug Design, Kluwer Academic Publishers, Dordrecht, 1999, pp. 43–65.
Xue, L. and Bajorath, J., Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening, Comb. Chem. High Throughput Screen., 3 (2000) 363–372.
Mason, J. S. and Beno, B. R., Library design using BCUT chemistry-space descriptors and multiple four-point pharmacophore fingerprints: Simultaneous optimization and structure-based diversity, J. Mol. Graph. Model., 18 (2000) 438–451, 538.
Dixon, S. L. and Villar, H. O., Bioactive diversity and screening library selection via affinity fingerprinting, J. Chem. Inf. Comput. Sci., 38 (1998) 1192–1203.
Lipkus, A. H., Exploring chemical rings in a simple topological-descriptor space, J. Chem. Inf. Comput. Sci., 41 (2001) 430–438.
Barnard, J. M., Downs, G. M., Von Scholley-Pfab, A. and Brown, R. D., Use of Markush structure analysis techniques for descriptor generation and clustering of large combinatorial libraries, J. Mol. Graph. Model., 18 (2000) 452–463.
Ivanciuc, O. and Klein, D. J., Computing Wiener-type indices for virtual combinatorial libraries generated from heteroatom-containing building blocks, J. Chem. Inf. Comput. Sci., 42 (2002) 8–22.
Rarey, M. and Stahl, M., Similarity searching in large combinatorial chemistry spaces, J. Comput. Aided Mol. Des., 15 (2001) 497–520.
Consonni, V., Todeschini, R. and Pavan, M., Structure/response correlations and similarity/diversity analysis by GETAWAY descriptors. 1. Theory of the novel 3-D molecular descriptors, J. Chem. Inf. Comput. Sci., 42 (2002) 682–692.
Makara, G. M., Measuring molecular similarity and diversity: Total pharmacophore diversity, J. Med. Chem., 44 (2001) 3563–3571.
Goodford, P. J., A computational procedure for determining energetically favorable binding sites on biologically important macromolecules, J. Med. Chem., 28 (1985) 849–857.
Cramer, R. D., Patterson, D. E. and Bunce, J. D., Comparative Molecular Field Analysis (CoMFA): 1. Effect of shape on binding of steroids to carrier proteins, J. Am. Chem. Soc., 110 (1988) 5959–5967.
Cruciani, G. and Watson, K. A., Comparative molecular field analysis using GRID force-field and GOLPE variable selection methods in a study of inhibitors of glycogen phosphorylase b, J. Med. Chem., 37 (1994) 2589–2601.
Cruciani, G., Pastor, M. and Guba, W., VolSurf: A new tool for the pharmacokinetic optimization of lead compounds, Eur. J. Pharm. Sci., 11 Suppl 2 (2000) S29–39.
Crivori, P., Cruciani, G., Carrupt, P. A. and Testa, B., Predicting blood-brain barrier permeation from three-dimensional molecular structure, J. Med. Chem., 43 (2000) 2204–2216.
Zamora, I., Oprea, T., Cruciani, G., Pastor, M. and Ungell, A. L., Surface descriptors for protein-ligand affinity prediction, J. Med. Chem., 46 (2003) 25–33.
Pastor, M., Cruciani, G., McLay, I., Pickett, S. and Clementi, S., GRid-INdependent descriptors (GRIND): A novel class of alignment-independent three-dimensional molecular descriptors, J. Med. Chem., 43 (2000) 3233–3243.
Benedetti, P., Mannhold, R., Cruciani, G. and Pastor, M., GBR compounds and mepyramines as cocaine abuse therapeutics: Chemometric studies on selectivity using grid independent descriptors (GRIND), J. Med. Chem., 45 (2002) 1577–1584.
Afzelius, L., Masimirembwa, C. M., Karlen, A., Andersson, T. B. and Zamora, I., Discriminant and quantitative PLS analysis of competitive CYP2C9 inhibitors versus non-inhibitors using alignment independent GRIND descriptors, J. Comput. Aided Mol. Des., 16 (2002) 443–458.
Cruciani, G., Pastor, M. and Mannhold, R., Suitability of molecular descriptors for database mining. A comparative analysis, J. Med. Chem., 45 (2002) 2685–2694.
Oprea, T. I., Zamora, I. and Ungell, A. L., Pharmacokinetically based mapping device for chemical space navigation, J. Comb. Chem., 4 (2002) 258–266.
Gasteiger, J., Rudolph, C. and Sadowski, J., Automatic generation of 3-D atomic coordinates for organic molecules, Tetrahedron Comp. Method., 3 (1990) 537–547.
Giesen, D. J., Gu, M. Z., Cramer, C. J. and Truhlar, D. G., A Universal Organic Solvation Model, J. Org. Chem., 61 (1996) 8720–8721.
AMSOL 6.5.2, Hawkins, G. D., Giesen, D. J., Lynch, G. C., Chambers, C. C., Rossi, I., Storer, J. W., Rinaldi, D., Liotard, D. A., Cramer, C. J. and Truhlar, D. G., University of Minnesota, Minneapolis, 1997.
VolSurf 3.0.7c, Cruciani, G., Pastor, M. and Mecucci, S., Molecular Discovery Ltd., Perugia, 2002.
Almond 3.2.0, Cruciani, G., Fontaine, F. and Pastor, M., Molecular Discovery Ltd., Perugia, 2003.
Carey, R. N., Wold, S. and Westgard, J. O., Principal component analysis: an alternative to 'referee' methods in method comparison studies, Anal. Chem., 47 (1975) 1824–1829.
SPSS 11.0.1, SPSS Inc., Chicago, 2001.
Downs, G. M. and Barnard, J. M., Clustering methods and their uses in computational chemistry, In Lipkowitz, K. B. and Boyd, D. B. (eds.), Reviews in Computational Chemistry, Wiley-VCH, John Wiley & Sons, Inc., 2002, pp. 1–40.
Cite this article
Fontaine, F., Pastor, M., Gutiérrez-de-Terán, H. et al. Use of alignment-free molecular descriptors in diversity analysis and optimal sampling of molecular libraries. Mol Divers 6, 135–147 (2003). https://doi.org/10.1023/B:MODI.0000006840.89805.e1
DOI: https://doi.org/10.1023/B:MODI.0000006840.89805.e1