International Conference on Parallel Processing and Applied Mathematics

PPAM 2007: Parallel Processing and Applied Mathematics pp 1240-1248

Protein Similarity Search with Subset Seeds on a Dedicated Reconfigurable Hardware

  • Pierre Peterlongo
  • Laurent Noé
  • Dominique Lavenier
  • Gilles Georges
  • Julien Jacques
  • Gregory Kucherov
  • Mathieu Giraud
Conference paper

DOI: 10.1007/978-3-540-68111-3_131

Volume 4967 of the book series Lecture Notes in Computer Science (LNCS)
Cite this paper as:
Peterlongo P. et al. (2008) Protein Similarity Search with Subset Seeds on a Dedicated Reconfigurable Hardware. In: Wyrzykowski R., Dongarra J., Karczewski K., Wasniewski J. (eds) Parallel Processing and Applied Mathematics. PPAM 2007. Lecture Notes in Computer Science, vol 4967. Springer, Berlin, Heidelberg

Abstract

With the sharp increase in available DNA and protein sequence data, new precise and fast similarity search methods are needed for large-scale genome and proteome comparisons. Modern seed-based similarity search techniques (spaced seeds, multiple seeds, subset seeds) provide a better sensitivity/specificity ratio. We present an implementation of such a seed-based technique on parallel, specialized hardware embedding a reconfigurable architecture (FPGA), where the FPGA is tightly connected to large-capacity Flash memories. This parallel system allows large databases to be fully indexed and rapidly accessed. Compared to traditional approaches, represented by the Blastp software, we obtain both a significant speed-up and better results. To the best of our knowledge, this is the first attempt to exploit efficient seed-based algorithms to parallelize sequence similarity search.
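As an illustration of the subset-seed idea mentioned in the abstract, the sketch below indexes a protein database by seed keys in plain Python. It is a minimal software sketch under stated assumptions, not the hardware implementation described in the paper: the amino-acid groupings, the seed `SEED`, and the helper names `seed_key`, `build_index` and `find_hits` are all hypothetical and chosen only for illustration. Each seed position maps a residue to the representative of its subset, so two words hit each other when their keys are equal.

```python
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical position types (illustrative only, not the paper's seeds).
EXACT = {aa: aa for aa in AMINO_ACIDS}            # residues must be identical
GROUPS = ["LVIM", "C", "A", "G", "ST", "P", "FYW", "EDNQ", "KR", "H"]
GROUPED = {aa: g[0] for g in GROUPS for aa in g}  # residues must share a group
ANY = {aa: "*" for aa in AMINO_ACIDS}             # don't-care position

# Hypothetical subset seed of span 4.
SEED = [EXACT, GROUPED, ANY, GROUPED]


def seed_key(word, seed=SEED):
    """Map a word to its key: one subset representative per seed position.

    Raises KeyError on residues outside the 20 standard amino acids.
    """
    return "".join(pos[aa] for pos, aa in zip(seed, word))


def build_index(sequences, seed=SEED):
    """Index every window of the database sequences by its seed key."""
    index = defaultdict(list)
    span = len(seed)
    for seq_id, seq in enumerate(sequences):
        for i in range(len(seq) - span + 1):
            index[seed_key(seq[i:i + span], seed)].append((seq_id, i))
    return index


def find_hits(query, index, seed=SEED):
    """Return (query_pos, seq_id, db_pos) for every window sharing a key."""
    span = len(seed)
    hits = []
    for q in range(len(query) - span + 1):
        key = seed_key(query[q:q + span], seed)
        for seq_id, d in index.get(key, []):
            hits.append((q, seq_id, d))
    return hits
```

For example, `find_hits("KIWA", build_index(["MKVLAT"]))` reports a hit against the database word `KVLA`, because both words share the key `KL*A`. In a full search pipeline, hits such as these would then be extended and scored; that step is omitted here.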

Keywords

sequence similarity search, spaced seeds, subset seeds, indexing, FPGA, reconfigurable architecture, dedicated hardware

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Pierre Peterlongo (1)
  • Laurent Noé (2)
  • Dominique Lavenier (1)
  • Gilles Georges (1)
  • Julien Jacques (1)
  • Gregory Kucherov (2)
  • Mathieu Giraud (2)

  1. Symbiose, IRISA, INRIA, CNRS, Université Rennes 1
  2. Sequoia/Bioinfo, LIFL, INRIA, CNRS, Université Lille 1