GPU-MEME: Using Graphics Hardware to Accelerate Motif Finding in DNA Sequences

  • Chen Chen
  • Bertil Schmidt
  • Liu Weiguo
  • Wolfgang Müller-Wittig
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5265)

Abstract

Discovery of motifs that are repeated in groups of biological sequences is a major task in bioinformatics. Iterative methods such as expectation maximization (EM) are a common approach to finding such patterns. However, the corresponding algorithms are highly compute-intensive due to the small size and degenerate nature of biological motifs. Runtime requirements are likely to become even more severe due to the rapid growth of available gene transcription data. In this paper we present a novel approach to accelerating motif discovery based on commodity graphics hardware (GPUs). To derive an efficient mapping onto this type of architecture, we have formulated the compute-intensive parts of the popular MEME tool as streaming algorithms. Our experimental results show that a single GPU allows speedups of one order of magnitude with respect to the sequential MEME implementation. Furthermore, parallelization on a GPU cluster improves the speedup to two orders of magnitude.
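
To make the EM-based search mentioned above more concrete, the following is a minimal sketch of one E-step for motif finding. It is not the MEME or GPU-MEME implementation: the function names, the dictionary-based position weight matrix, and the assumption of exactly one motif occurrence per sequence are illustrative choices made here. It may help show why this kind of computation suits a streaming formulation, since every window is scored independently of all others.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not taken from MEME or GPU-MEME.

def window_likelihoods(seq, pwm, background):
    """Motif-vs-background likelihood ratio for every length-w window of seq.

    pwm is a list of w dicts mapping base -> probability;
    background is a single dict mapping base -> probability.
    Each window is scored independently, so the outer loop is data-parallel.
    """
    w = len(pwm)
    scores = []
    for start in range(len(seq) - w + 1):
        ratio = 1.0
        for offset in range(w):
            base = seq[start + offset]
            ratio *= pwm[offset][base] / background[base]
        scores.append(ratio)
    return scores


def e_step(seqs, pwm, background):
    """Posterior over motif start positions in each sequence.

    Assumes exactly one motif occurrence per sequence, so the window
    scores of a sequence are normalized to sum to 1.
    """
    posteriors = []
    for seq in seqs:
        scores = window_likelihoods(seq, pwm, background)
        total = sum(scores)
        posteriors.append([s / total for s in scores])
    return posteriors


if __name__ == "__main__":
    background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    # Hypothetical width-2 motif that favours "GT".
    pwm = [
        {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
        {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    ]
    for row in e_step(["ACGTAC", "GTTTAA"], pwm, background):
        print([round(p, 3) for p in row])
```

In a full EM loop, these posteriors would be used in the M-step to re-estimate the motif model, and the two steps would repeat until convergence.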

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Chen Chen (1)
  • Bertil Schmidt (1)
  • Liu Weiguo (1)
  • Wolfgang Müller-Wittig (1)
  1. School of Computer Engineering, Nanyang Technological University, Singapore
