Sharper Upper and Lower Bounds for an Approximation Scheme for Consensus-Pattern

  • Broňa Brejová
  • Daniel G. Brown
  • Ian M. Harrower
  • Alejandro López-Ortiz
  • Tomáš Vinař
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3537)

Abstract

We present sharper upper and lower bounds for a known polynomial-time approximation scheme due to Li, Ma and Wang [7] for the Consensus-Pattern problem. This NP-hard problem is an abstraction of motif finding, a common bioinformatics discovery task. The PTAS due to Li et al. is simple, and a preliminary implementation [8] gave reasonable results in practice. However, the previously known bounds on its performance are useless when runtimes are actually manageable. Here, we present much sharper lower and upper bounds on the performance of this algorithm that partially explain why its behavior is so much better in practice than what was previously predicted in theory. We also give specific examples of instances of the problem for which the PTAS performs poorly in practice, and show that the asymptotic performance bound given in the original proof matches the behavior of a simple variant of the algorithm on a particularly bad instance of the problem.
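
The abstract does not restate the scheme, so the following is only a minimal sketch of the sampling idea usually attributed to Li, Ma and Wang [7], not the authors' implementation. It assumes the standard formulation: given sequences and a pattern length L, find a length-L string and one length-L window per sequence minimizing the total Hamming distance. The function names, the parameter r (sample size), and the tie-breaking rule are illustrative assumptions, and every sequence is assumed to have length at least L.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def windows(s, L):
    """All length-L substrings of s (assumes len(s) >= L)."""
    return [s[i:i + L] for i in range(len(s) - L + 1)]


def majority(strings):
    """Column-wise majority string; ties go to the letter seen first."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*strings))


def cost(candidate, seqs, L):
    """Sum over sequences of the distance to the closest length-L window."""
    return sum(min(hamming(candidate, w) for w in windows(s, L)) for s in seqs)


def consensus_pattern_sample(seqs, L, r):
    """Try the majority string of every multiset of r windows; keep the best.

    Hypothetical helper: enumerates samples with repetition, which is
    exponential in r and only practical for small inputs.
    """
    pool = [w for s in seqs for w in windows(s, L)]
    best = None
    for sample in combinations_with_replacement(pool, r):
        cand = majority(sample)
        c = cost(cand, seqs, L)
        if best is None or c < best[0]:
            best = (c, cand)
    return best
```

For example, `consensus_pattern_sample(["AACGT", "TACGA", "GGACG"], 3, 1)` considers each window as a candidate on its own. Larger values of r enlarge the search and are what the approximation guarantee depends on, at a steep cost in running time.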

References

  1. Bailey, T.L., Elkan, C.: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In: Proceedings of the 2nd International Conference on Intelligent Systems for Molecular Biology (ISMB 1994), pp. 28–36. AAAI Press, Menlo Park (1994)
  2. Buhler, J., Tompa, M.: Finding motifs using random projections. In: Proceedings of the 5th Annual International Conference on Computational Molecular Biology (RECOMB 2001), pp. 69–76 (2001)
  3. Hertz, G.Z., Stormo, G.D.: Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 15(7-8), 563–577 (1999)
  4. Keich, U., Pevzner, P.A.: Finding motifs in the twilight zone. Bioinformatics 18, 1374–1381 (2002)
  5. Keich, U., Pevzner, P.A.: Subtle motifs: defining the limits of motif finding algorithms. Bioinformatics 18, 1382–1390 (2002)
  6. Lawrence, C.E., Altschul, S.F., Boguski, M.S., Liu, J.S., Neuwald, A.F., Wootton, J.C.: Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262(5131), 208–214 (1993)
  7. Li, M., Ma, B., Wang, L.: Finding similar regions in many strings. Journal of Computer and System Sciences 65(1), 73–96 (2002)
  8. Liang, C.: COPIA: A New Software for Finding Consensus Patterns in Unaligned Protein Sequences. Master’s thesis, University of Waterloo (October 2001)
  9. Liu, J.: A Combinatorial Approach for Motif Discovery in Unaligned DNA Sequences. Master’s thesis, University of Waterloo (March 2004)
  10. Pevzner, P.A., Sze, S.: Combinatorial approaches to finding subtle signals in DNA sequences. In: Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB 2000), pp. 269–278 (2000)
  11. Thompson, M.E.: Theory of Sample Surveys. Chapman and Hall, Boca Raton (1997)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Broňa Brejová (1)
  • Daniel G. Brown (1)
  • Ian M. Harrower (1)
  • Alejandro López-Ortiz (1)
  • Tomáš Vinař (1)

  1. School of Computer Science, University of Waterloo
