Accelerating Motif Discovery: Motif Matching on Parallel Hardware
Discovery of motifs in biological sequences is an important problem, and several computational methods have been developed to date. One of the main limitations of the established motif discovery methods is that their running time is prohibitive for very large data sets, such as upstream regions of large sets of cell-cycle regulated genes. Parallel versions have been developed for some of these methods, but they require supercomputers or large computer clusters. Here, we propose and define an abstract module, PAMM (Parallel Acceleration of Motif Matching), designed for motif matching on parallel hardware. As a proof of concept, we provide a concrete implementation of our approach called MAMA. The implementation is based on the MEME algorithm and uses an implementation of PAMM based on specialized hardware to accelerate motif matching. Running MAMA on a standard PC with specialized hardware on a single PCI card compares favorably to running parallel MEME on a cluster of 12 computers.
Keywords: Expectation Maximization, Motif Discovery, Specialized Hardware, Multiple Instruction Single Data, Motif Position
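To make the term "motif matching" concrete, the following is a minimal software sketch of the kind of operation the abstract describes accelerating: scoring a motif model against every position of a sequence. It is an illustration under stated assumptions, not the PAMM interface or the MAMA implementation, neither of which is specified here. The log-odds scoring, the uniform background, the pseudocount, and the function names `log_odds_matrix`, `match_motif`, and `best_match` are all hypothetical choices made for this example.

```python
import math

ALPHABET = "ACGT"


def log_odds_matrix(counts, background=0.25, pseudocount=1.0):
    """Turn per-position base counts into a log-odds scoring matrix.

    Assumption: uniform background frequency and a simple additive
    pseudocount; real tools may model both differently.
    """
    matrix = []
    for column in counts:
        total = sum(column[b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        matrix.append({
            b: math.log2((column[b] + pseudocount) / total / background)
            for b in ALPHABET
        })
    return matrix


def match_motif(matrix, sequence):
    """Score every window of `sequence` against the motif matrix.

    Each window is scored independently of the others, which is why
    this step is a natural candidate for parallel hardware.
    """
    width = len(matrix)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append(sum(matrix[i][base] for i, base in enumerate(window)))
    return scores


def best_match(matrix, sequence):
    """Return (position, score) of the best window, or None if none fits."""
    scores = match_motif(matrix, sequence)
    if not scores:
        return None
    position = max(range(len(scores)), key=scores.__getitem__)
    return position, scores[position]


if __name__ == "__main__":
    # Hypothetical counts for a width-4 motif with consensus TATA.
    consensus = "TATA"
    counts = [{b: (10 if b == c else 0) for b in ALPHABET} for c in consensus]
    matrix = log_odds_matrix(counts)
    print(best_match(matrix, "GGCTATAGGC"))
```

The cost of this scan grows with the total sequence length times the motif width, and an iterative method such as MEME repeats it many times, which is consistent with the abstract's point that running time becomes prohibitive on very large data sets.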