Gapped Local Similarity Search with Provable Guarantees

  • Manikandan Narayanan
  • Richard M. Karp
Conference paper

DOI: 10.1007/978-3-540-30219-3_7

Part of the Lecture Notes in Computer Science book series (LNCS, volume 3240)
Cite this paper as:
Narayanan M., Karp R.M. (2004) Gapped Local Similarity Search with Provable Guarantees. In: Jonassen I., Kim J. (eds) Algorithms in Bioinformatics. WABI 2004. Lecture Notes in Computer Science, vol 3240. Springer, Berlin, Heidelberg

Abstract

We present a program qhash, based on q-gram filtration and high-dimensional search, to find gapped local similarities between two sequences. Our approach differs from past q-gram-based approaches in two main respects. First, our filtration step uses algorithms for a sparse all-pairs problem, while past studies use suffix-tree-like structures and counters. Second, our program works in sequence-sequence mode, while most past programs (except QUASAR) work in pattern-database mode.

We leverage existing research in high-dimensional proximity search to discuss sparse all-pairs algorithms, and show them to be subquadratic under certain reasonable input assumptions. Our qhash program has provable sensitivity (even on worst-case inputs) and average-case performance guarantees. It is significantly faster than a fully sensitive dynamic-programming-based program for strong similarity search on long sequences.
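
As a rough illustration of what q-gram filtration means in general, the sketch below compares fixed-length windows of two sequences by the number of q-grams they share and keeps only the pairs that meet the classical q-gram lemma threshold (two strings of length w within edit distance e share at least w - q + 1 - q*e q-grams). This is not the qhash algorithm: it is a naive quadratic all-pairs scan, whereas the abstract describes sparse all-pairs algorithms. The function names, the window length w, and the error bound e are all assumptions made for the example.

```python
from collections import Counter


def qgram_profile(s, q):
    """Multiset of all length-q substrings (q-grams) of s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def shared_qgrams(a, b, q):
    """Number of q-grams common to a and b, counted with multiplicity."""
    common = qgram_profile(a, q) & qgram_profile(b, q)  # multiset intersection
    return sum(common.values())


def qgram_threshold(w, q, e):
    """q-gram lemma bound: two length-w strings within edit distance e
    share at least this many q-grams."""
    return w - q + 1 - q * e


def candidate_window_pairs(x, y, w, q, e):
    """Naive filter (illustrative only): return start positions (i, j) of
    length-w windows of x and y that share enough q-grams to possibly lie
    within edit distance e. Surviving pairs would still need verification,
    for example by dynamic programming."""
    t = qgram_threshold(w, q, e)
    y_profiles = [qgram_profile(y[j:j + w], q) for j in range(len(y) - w + 1)]
    pairs = []
    for i in range(len(x) - w + 1):
        px = qgram_profile(x[i:i + w], q)
        for j, py in enumerate(y_profiles):
            if sum((px & py).values()) >= t:
                pairs.append((i, j))
    return pairs
```

A filter of this kind never discards a window pair that satisfies the error bound, which is the sense in which filtration can preserve sensitivity; the cost of the brute-force loop above is what a subquadratic all-pairs method would aim to avoid.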

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Manikandan Narayanan (1)
  • Richard M. Karp (1, 2)
  1. Computer Science Division, University of California, Berkeley, USA
  2. International Computer Science Institute, Berkeley, USA