
Use of alignment-free molecular descriptors in diversity analysis and optimal sampling of molecular libraries

Molecular Diversity

Abstract

The selection of a sample of diverse compounds is a common strategy for exploring large molecular libraries. However, the success of such an approach depends on the selection of relevant molecular descriptors and the use of appropriate sampling methods. In the context of pharmaceutical research, the molecular descriptors should be based on physicochemical properties related to the pharmacological behaviour of the compounds. In this sense, the alignment-free GRIND and VolSurf molecular descriptors are promising candidates, since they have been successfully used in the modelling of both pharmacodynamic and pharmacokinetic properties of drugs. This work describes the use of such descriptors in the diversity sampling of a library of primary amines and compares the results with those obtained in a previous study that used quantum-mechanical descriptors. As in the previous work, principal component (PC) analysis was applied to reduce the dimensionality and remove redundant information from the original descriptors, and the compounds were sampled on the basis of k-means clustering in the space of the selected PCs. The results of the present study show that VolSurf and GRIND provide sampling of similar quality with regard to global features of the molecules such as hydrophilicity; however, the two descriptor sets treat the topology of the compounds differently. The similarity between particular compounds strongly depends on the original descriptors used. However, all the sample selections made in the PC space after k-means clustering provide the same apparent diversity in comparison to the whole dataset. The results indicate that there is no best set of descriptors on a diversity basis. The selection of descriptors must be based on the drug features to be investigated.
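
The sampling workflow summarised in the abstract (PC analysis followed by k-means clustering in the space of the selected PCs) can be illustrated with a minimal sketch. This is not the authors' implementation, which relied on VolSurf, Almond and SPSS; it uses NumPy only, and several details are assumptions made for illustration: the descriptors are autoscaled before the PC analysis, the k-means step is a plain Lloyd's algorithm with random initialisation, and one compound per cluster (the one closest to the centroid) is taken as the representative. The function names `pca_scores`, `kmeans` and `diverse_sample` are hypothetical.

```python
import numpy as np


def pca_scores(X, n_components):
    """Autoscale a (compounds x descriptors) matrix and project it onto its first PCs."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant descriptors stay at zero after centring
    Z = (X - X.mean(axis=0)) / std  # autoscaling is an assumption of this sketch
    # Rows of Vt are the principal axes, ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T


def _assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = _assign(points, centroids)
        # An empty cluster keeps its previous centroid.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return _assign(points, centroids), centroids


def diverse_sample(X, n_components, k, seed=0):
    """Return the row indices of one representative compound per cluster.

    The representative is the compound nearest to the cluster centroid in
    PC space (an illustrative choice, not taken from the article).
    """
    scores = pca_scores(X, n_components)
    labels, centroids = kmeans(scores, k, seed=seed)
    picks = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue  # skip clusters that ended up empty
        d = np.linalg.norm(scores[members] - centroids[j], axis=1)
        picks.append(int(members[d.argmin()]))
    return sorted(picks)
```

With a descriptor matrix `X` holding one row per compound, `diverse_sample(X, n_components=3, k=10)` would return up to ten row indices. Running the same call on matrices built from different descriptor sets and comparing the selections is one way to explore the abstract's point that the similarity between particular compounds depends on the descriptors used.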



Author information


Corresponding author

Correspondence to Ferran Sanz.


About this article

Cite this article

Fontaine, F., Pastor, M., Gutiérrez-de-Terán, H. et al. Use of alignment-free molecular descriptors in diversity analysis and optimal sampling of molecular libraries. Mol Divers 6, 135–147 (2003). https://doi.org/10.1023/B:MODI.0000006840.89805.e1


  • DOI: https://doi.org/10.1023/B:MODI.0000006840.89805.e1
