JEDA: Joint entropy diversity analysis. An information-theoretic method for choosing diverse and representative subsets from combinatorial libraries

Landon, Melissa R.; Schaus, Scott E.

doi:10.1007/s11030-006-9042-4

Full-length Paper
Published: 21 September 2006

Molecular Diversity, volume 10, pages 333–339 (2006)

Melissa R. Landon^1,2 &
Scott E. Schaus^2,3

Summary

The joint entropy-based diversity analysis (JEDA) program is a new method for selecting representative subsets of compounds from combinatorial libraries. Like other cell-based diversity analyses, it uses a set of chemical descriptors to partition the chemical space of a compound library. Unlike other metrics for choosing a compound from each partition, however, JEDA determines a representative subset with a Shannon entropy-based scoring function implemented in a probabilistic search algorithm. This approach enables the selection of compounds that are not only diverse but also representative of the densities of chemical space occupied by the original library. JEDA also lets the user define the size of the subset, so that restrictions on time and chemical reagents can be taken into account. Subsets created by JEDA are compared with subsets obtained using other partition-based diversity analyses, namely principal components analysis and median partitioning, on a combinatorial library derived from the Comprehensive Medicinal Chemistry dataset.
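
The general idea of scoring a subset by the Shannon entropy of its cell occupancy can be illustrated with a minimal Python sketch. This is not the JEDA scoring function or search algorithm, neither of which is specified in the summary. It assumes a binary median split per descriptor to define cells, the plain joint entropy of the subset's cell distribution as the score, and a random-swap hill climb as a stand-in for the probabilistic search. The function names `median_cells`, `joint_entropy`, and `select_subset` are hypothetical. Note that maximizing this score alone favors even coverage of cells and does not by itself reproduce the library's densities.

```python
import math
import random
from collections import Counter


def median_cells(descriptors):
    """Assign each compound a cell: one bit per descriptor,
    1 if the value lies above that descriptor's median (assumed binning)."""
    n_desc = len(descriptors[0])
    medians = []
    for j in range(n_desc):
        col = sorted(row[j] for row in descriptors)
        mid = len(col) // 2
        medians.append(col[mid] if len(col) % 2 else 0.5 * (col[mid - 1] + col[mid]))
    return [tuple(int(row[j] > medians[j]) for j in range(n_desc))
            for row in descriptors]


def joint_entropy(cells):
    """Shannon entropy, in bits, of the joint cell-occupancy distribution."""
    counts = Counter(cells)
    total = len(cells)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_subset(cells, k, n_steps=2000, seed=0):
    """Random-swap hill climb: keep a swap if the subset's entropy does not drop.
    A stand-in for a probabilistic search, not the published algorithm."""
    rng = random.Random(seed)
    chosen = rng.sample(range(len(cells)), k)
    best = joint_entropy([cells[i] for i in chosen])
    for _ in range(n_steps):
        pos = rng.randrange(k)              # slot in the subset to replace
        cand = rng.randrange(len(cells))    # candidate compound index
        if cand in chosen:
            continue
        trial = chosen[:]
        trial[pos] = cand
        score = joint_entropy([cells[i] for i in trial])
        if score >= best:
            chosen, best = trial, score
    return sorted(chosen), best
```

With a toy descriptor table such as `[[1, 10], [2, 20], [3, 30], [4, 40]]`, `median_cells` yields two occupied cells, and `select_subset(cells, 2)` returns one compound from each. The subset size `k` plays the role of the user-defined subset size described above.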

Author information

Authors and Affiliations

Graduate Program in Bioinformatics and Systems Biology, Boston, MA, 02215, U.S.A
Melissa R. Landon
Center for Chemical Methodology and Library Development, Boston, MA, 02215, U.S.A
Melissa R. Landon & Scott E. Schaus
Department of Chemistry, Boston University, Boston, MA, 02215, U.S.A
Scott E. Schaus

Corresponding author

Correspondence to Scott E. Schaus.

About this article

Cite this article

Landon, M.R., Schaus, S.E. JEDA: Joint entropy diversity analysis. An information-theoretic method for choosing diverse and representative subsets from combinatorial libraries. Mol Divers 10, 333–339 (2006). https://doi.org/10.1007/s11030-006-9042-4

Received: 20 January 2006
Accepted: 27 March 2006
Published: 21 September 2006
Issue Date: August 2006
DOI: https://doi.org/10.1007/s11030-006-9042-4
