New Bounds for Motif Finding in Strong Instances

  • Broňa Brejová
  • Daniel G. Brown
  • Ian M. Harrower
  • Tomáš Vinař
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4009)

Abstract

Many algorithms for motif finding that are commonly used in bioinformatics start by sampling r potential motif occurrences from n input sequences. The motif is derived from these samples and evaluated on all sequences. This approach works extremely well in practice, and is implemented by several programs. Li, Ma and Wang have shown that a simple algorithm of this sort is a polynomial-time approximation scheme. However, in 2005, we showed specific instances of the motif finding problem for which the approximation ratio of a slight variation of this scheme converges to one very slowly as a function of the sample size r, which seemingly contradicts the high performance of sample-based algorithms. Here, we account for the difference by showing that, for a variety of different definitions of “strong” binary motifs, the approximation ratio of sample-based algorithms converges to one exponentially fast in r. We also describe “very strong” motifs, for which the simple sample-based approach always identifies the correct motif, even for modest values of r.
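The sample-then-evaluate scheme described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: a binary alphabet, a motif of fixed length L, a column-wise majority vote to derive the motif from the samples, total Hamming distance to the best-matching window of each sequence as the score, and exhaustive enumeration of all size-r multisets of distinct windows in place of random sampling. All function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations_with_replacement


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def consensus(samples):
    """Column-wise majority vote over equal-length binary strings (ties go to '0')."""
    length = len(samples[0])
    return "".join(
        "1" if 2 * sum(s[i] == "1" for s in samples) > len(samples) else "0"
        for i in range(length)
    )


def cost(motif, sequences):
    """Sum, over all sequences, of the distance from the motif to its closest window."""
    L = len(motif)
    return sum(
        min(hamming(motif, seq[i:i + L]) for i in range(len(seq) - L + 1))
        for seq in sequences
    )


def sample_based_motif(sequences, L, r):
    """Try every multiset of r candidate windows and keep the best consensus.

    Assumption: candidates are the distinct length-L windows of the input,
    which is a simplification chosen for this sketch.
    """
    windows = sorted(
        {seq[i:i + L] for seq in sequences for i in range(len(seq) - L + 1)}
    )
    best = None
    for samples in combinations_with_replacement(windows, r):
        motif = consensus(samples)   # derive the motif from the samples
        c = cost(motif, sequences)   # evaluate it on all sequences
        if best is None or c < best[1]:
            best = (motif, c)
    return best


if __name__ == "__main__":
    # Toy input in which "1011" occurs exactly in every sequence.
    seqs = ["0010110", "1011000", "0001011"]
    print(sample_based_motif(seqs, L=4, r=3))
```

The number of candidate sample sets grows quickly with r, so this sketch is only practical for toy inputs; it is meant to show the roles of the sample size r and of the evaluation step, not to reproduce any particular program.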


Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Broňa Brejová (1)
  • Daniel G. Brown (1)
  • Ian M. Harrower (1)
  • Tomáš Vinař (1)
  1. David R. Cheriton School of Computer Science, University of Waterloo
