Toward Optimal Motif Enumeration

  • Patricia A. Evans
  • Andrew D. Smith
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2748)


We present algorithms that reduce the time and space needed to solve problems of finding all motifs common to a set of sequences. In particular, we give algorithms that (1) require time and space linear in the size of the input, (2) succinctly encode the output so that the time and space requirements depend on the number of motifs, not directly on motif length, and (3) efficiently parallelize the enumeration.
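To make the problem concrete, here is a minimal brute-force sketch in Python. It is not the authors' method. The abstract does not define "motif", so the sketch assumes a common formulation: a motif is a string of length `l` that occurs in every input sequence with at most `d` mismatches (Hamming distance). The names `hamming`, `occurs_within`, and `common_motifs`, and the default DNA alphabet, are hypothetical and chosen only for illustration.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif: str, seq: str, d: int) -> bool:
    """True if some length-len(motif) window of seq is within d mismatches of motif."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d for i in range(len(seq) - l + 1))


def common_motifs(seqs, l: int, d: int, alphabet: str = "ACGT"):
    """Naive baseline: test every length-l string over the alphabet.

    Assumed motif model (not taken from the paper): a motif must occur,
    with at most d mismatches, in every sequence of seqs.
    """
    candidates = ("".join(p) for p in product(alphabet, repeat=l))
    return sorted(m for m in candidates if all(occurs_within(m, s, d) for s in seqs))


if __name__ == "__main__":
    seqs = ["ACGTAC", "TTACGA", "GACGTT"]
    print(common_motifs(seqs, l=3, d=0))
```

This baseline tests all `len(alphabet) ** l` candidate strings, so its running time grows directly with motif length. That dependence is what the abstract says the presented algorithms avoid.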





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Patricia A. Evans (1)
  • Andrew D. Smith (1, 2)
  1. University of New Brunswick, Fredericton, N.B., Canada
  2. Ontario Cancer Institute, University Health Network, Suite 703, Toronto, Canada
