An Adaptive Reference Point Approach to Efficiently Search Large Chemical Databases

  • Francesco Napolitano
  • Roberto Tagliaferri
  • Pierre Baldi
Conference paper

DOI: 10.1007/978-3-319-04129-2_7

Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 26)
Cite this paper as:
Napolitano F., Tagliaferri R., Baldi P. (2014) An Adaptive Reference Point Approach to Efficiently Search Large Chemical Databases. In: Bassis S., Esposito A., Morabito F. (eds) Recent Advances of Neural Network Models and Applications. Smart Innovation, Systems and Technologies, vol 26. Springer, Cham

Abstract

The ability to rapidly search large repositories of molecules is crucial in chemoinformatics. In this work we propose AOR, an approach based on adaptive reference points that improves on state-of-the-art performance in querying large repositories of binary fingerprints based on the Tanimoto distance. We propose a unifying view of reference points and previously proposed hashing techniques. We also provide a mathematical model to forecast and generalize the results, which is validated by simulating queries over an excerpt of ChemDB. Finally, clustering techniques are introduced to further improve performance. In typical situations the proposed algorithm is shown to resolve queries up to 4 times faster than the compared methods.
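
To make the general idea of reference-point pruning concrete, the following is a minimal sketch and not the AOR algorithm itself, whose details are not given in the abstract. It assumes fingerprints are stored as Python integers used as bitmasks and that a single fixed reference fingerprint is used. The function names (`tanimoto_distance`, `build_index`, `range_query`) are hypothetical. The pruning step relies on the fact that the Tanimoto (Jaccard) distance on binary vectors satisfies the triangle inequality.

```python
def tanimoto_distance(a: int, b: int) -> float:
    """Tanimoto (Jaccard) distance between two fingerprints given as int bitmasks."""
    union = bin(a | b).count("1")
    if union == 0:
        # Two empty fingerprints are treated as identical (a convention assumed here).
        return 0.0
    intersection = bin(a & b).count("1")
    return 1.0 - intersection / union


def build_index(db, ref):
    """Precompute the distance from every database fingerprint to the reference."""
    return [tanimoto_distance(x, ref) for x in db]


def range_query(db, ref, ref_dists, query, radius):
    """Return (hits, computed): fingerprints within `radius` of `query`, and the
    number of exact distance computations that pruning could not avoid."""
    d_qr = tanimoto_distance(query, ref)
    hits = []
    computed = 0
    for x, d_xr in zip(db, ref_dists):
        # Triangle inequality: d(q, x) >= |d(q, r) - d(x, r)|.
        # If this lower bound already exceeds the radius, x cannot be a hit.
        if abs(d_qr - d_xr) > radius:
            continue
        computed += 1
        if tanimoto_distance(query, x) <= radius:
            hits.append(x)
    return hits, computed


if __name__ == "__main__":
    db = [0b1111, 0b1110, 0b0001, 0b110000]  # toy fingerprints, illustrative only
    ref = 0b1111                             # hypothetical fixed reference point
    ref_dists = build_index(db, ref)
    hits, computed = range_query(db, ref, ref_dists, query=0b1111, radius=0.3)
    print(hits, computed)
```

In this toy run only two of the four exact distances need to be computed, because the other two fingerprints are excluded by the lower bound alone. How well such pruning works in practice depends on how the reference points are chosen, which is the aspect the adaptive approach described above is concerned with.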

Keywords

molecular fingerprints, chemical database, binary vector search

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Francesco Napolitano (1, 2)
  • Roberto Tagliaferri (1)
  • Pierre Baldi (2)
  1. Department of Informatics, University of Salerno, Fisciano, Italy
  2. Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California-Irvine, Irvine, USA
