Selected Contributions in Data Analysis and Classification

Part of the series Studies in Classification, Data Analysis, and Knowledge Organization pp 433-444

Clustering of Molecules: Influence of the Similarity Measures

  • Samia Aci (Centre de Criblage pour Molécules Bioactives)
  • Gilles Bisson (Laboratoire TIMC-IMAG, CNRS / UJF 5525)
  • Sylvaine Roy (Laboratoire Biologie, Informatique, Mathématiques, CEA-DSV-iRTSV)
  • Samuel Wieczorek (Laboratoire Biologie, Informatique, Mathématiques, CEA-DSV-iRTSV)


In this paper, we present the results of an experimental study analyzing the effect of various similarity (or distance) measures on the quality of the clustering of a set of molecules. We focus mainly on clustering approaches that can deal directly with the 2D representation of molecules (i.e., graphs). In this context, we found that an approach based on asymmetrical similarity measures appears relevant. Our experiments were carried out on a dataset from the High Throughput Screening (HTS) domain.
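
To make the notion of an asymmetrical similarity measure concrete, the following is a minimal sketch using the Tversky index, a well-known asymmetric measure. It is an illustrative assumption, not necessarily one of the measures evaluated in the paper. Molecules are reduced here to sets of hypothetical substructure labels rather than full 2D graphs, and the function name and the labels are invented for the example.

```python
def tversky(a: set, b: set, alpha: float = 1.0, beta: float = 0.0) -> float:
    """Tversky index between two feature sets.

    alpha weights the features found only in `a`, beta those found only
    in `b`. With alpha == beta == 1 this reduces to the symmetric
    Tanimoto (Jaccard) coefficient; with alpha != beta the measure is
    asymmetric, so tversky(a, b) may differ from tversky(b, a).
    """
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = common + alpha * only_a + beta * only_b
    # Convention chosen for this sketch: two empty sets have similarity 0.
    return common / denom if denom else 0.0


# Hypothetical substructure labels, chosen only for illustration.
fragment = {"C=O", "OH"}
molecule = {"C=O", "OH", "ring", "NH2"}

frag_in_mol = tversky(fragment, molecule)            # fragment fully contained
mol_in_frag = tversky(molecule, fragment)            # molecule only half covered
tanimoto = tversky(fragment, molecule, 1.0, 1.0)     # symmetric special case
```

With alpha = 1 and beta = 0, the score measures how much of the first molecule is included in the second, so a small fragment compared with a larger molecule that contains it scores higher than the reverse comparison. A symmetric measure such as Tanimoto cannot express this inclusion relationship.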