Full–length Paper

Molecular Diversity, Volume 10, Issue 3, pp 333-339

JEDA: Joint entropy diversity analysis. An information-theoretic method for choosing diverse and representative subsets from combinatorial libraries

  • Melissa R. Landon (Graduate Program in Bioinformatics and Systems Biology; Center for Chemical Methodology and Library Development)
  • Scott E. Schaus (Center for Chemical Methodology and Library Development; Department of Chemistry, Boston University), corresponding author


The joint entropy-based diversity analysis (JEDA) program is a new method for selecting representative subsets of compounds from combinatorial libraries. As in other cell-based diversity analyses, a set of chemical descriptors is used to partition the chemical space of a compound library. Unlike other methods, which apply some metric to choose a compound from each partition, JEDA determines a representative subset with a Shannon entropy-based scoring function implemented in a probabilistic search algorithm. This approach enables the selection of compounds that are not only diverse but also represent the densities of chemical space occupied by the original library. Additionally, JEDA lets the user define the size of the subset so that restrictions on time and chemical reagents can be taken into account. Subsets created by JEDA are compared with subsets obtained using other partition-based diversity analyses, namely principal components analysis and median partitioning, on a combinatorial library derived from the Comprehensive Medical Chemistry Dataset.
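
To make the idea of an entropy score over a partitioned descriptor space concrete, the following is a minimal Python sketch and not the JEDA implementation. The abstract does not give the scoring function or the search procedure, so everything here is an assumption made for illustration: the equal-width binning, the function names `assign_cells`, `joint_entropy`, and `select_subset`, and the random-swap search. The score used is the Shannon entropy of the subset's cell occupancy, which is the joint entropy of the binned descriptors.

```python
import math
import random
from collections import Counter


def assign_cells(descriptors, n_bins):
    """Map each descriptor vector to a tuple of bin indices (its cell).

    Assumption: equal-width bins per descriptor; the paper may bin differently.
    """
    n_dims = len(descriptors[0])
    lows = [min(row[d] for row in descriptors) for d in range(n_dims)]
    highs = [max(row[d] for row in descriptors) for d in range(n_dims)]
    cells = []
    for row in descriptors:
        cell = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            if span == 0:
                idx = 0  # constant descriptor: everything falls in one bin
            else:
                idx = min(int((row[d] - lows[d]) / span * n_bins), n_bins - 1)
            cell.append(idx)
        cells.append(tuple(cell))
    return cells


def joint_entropy(cells):
    """Shannon entropy (in bits) of the cell occupancy distribution."""
    counts = Counter(cells)
    total = len(cells)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def select_subset(cells, subset_size, n_iter=2000, seed=0):
    """Hypothetical random-swap search: keep a swap if entropy does not drop."""
    rng = random.Random(seed)
    indices = list(range(len(cells)))
    chosen = rng.sample(indices, subset_size)
    chosen_set = set(chosen)
    best = joint_entropy([cells[i] for i in chosen])
    for _ in range(n_iter):
        pos = rng.randrange(subset_size)
        candidate = rng.choice(indices)
        if candidate in chosen_set:
            continue  # already selected; try another proposal
        old = chosen[pos]
        chosen[pos] = candidate
        score = joint_entropy([cells[i] for i in chosen])
        if score >= best:
            best = score
            chosen_set.remove(old)
            chosen_set.add(candidate)
        else:
            chosen[pos] = old  # revert the swap
    return sorted(chosen), best
```

A call such as `select_subset(assign_cells(descriptors, n_bins=4), subset_size=20)` returns the indices of the chosen compounds and their entropy score, with `subset_size` playing the role of the user-defined subset size. Maximizing occupancy entropy alone rewards even coverage of the occupied cells; it does not by itself reproduce the density-representing behavior the abstract attributes to JEDA, which would need the paper's actual scoring function.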

Key words

chemical diversity, Shannon entropy, representative subset selection