Bit-Parallel Multiple Pattern Matching

  • Tuan Tu Tran
  • Mathieu Giraud
  • Jean-Stéphane Varré
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7204)

Abstract

Text matching with errors is a routine task in computational biology. We present an extension of the bit-parallel Wu-Manber algorithm [16] that combines several searches for a pattern into a collection of fixed-length words. We further present an OpenCL parallelization of a redundant index on massively parallel multicore processors, within a framework of searching for similarities with seed-based heuristics. We successfully implemented and ran our algorithms on GPU and multicore CPU. Some of the speedups obtained exceed 60×.
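For readers unfamiliar with the baseline, the following is a minimal sketch of the classical single-pattern Wu-Manber search with at most k edit errors, written in the shift-and form commonly found in textbooks. It is not the multi-pattern extension or the OpenCL code described in this paper; the function name `wu_manber_search` and its return convention (0-based end positions) are assumptions made for illustration.

```python
def wu_manber_search(pattern: str, text: str, k: int) -> list[int]:
    """Return the 0-based end positions in `text` where `pattern` matches
    some substring with at most `k` edit errors (Levenshtein distance).

    Illustrative sketch of the classical bit-parallel recurrence; the name
    and interface are hypothetical, not taken from the paper.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    if k < 0:
        raise ValueError("k must be non-negative")

    # masks[c] has bit i set iff pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << m) - 1      # keep only the m state bits (Python ints are unbounded)
    accept = 1 << (m - 1)    # bit set when the whole pattern has been matched

    # rows[d], bit i set: pattern[:i+1] matches a suffix of the text read so
    # far with at most d errors. Initially, d leading pattern characters can
    # be skipped (deleted) at a cost of d errors.
    rows = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for j, ch in enumerate(text):
        b = masks.get(ch, 0)
        old = rows[0]
        new = ((old << 1) | 1) & b & full   # exact shift-and step
        rows[0] = new
        for d in range(1, k + 1):
            cur = rows[d]
            nxt = (
                ((cur << 1) & b)        # match: same error budget
                | old                   # extra text character (insertion)
                | ((old | new) << 1)    # substitution, or skipped pattern character
                | 1                     # first pattern character can always cost one error
            ) & full
            old, new = cur, nxt
            rows[d] = nxt
        if rows[k] & accept:
            hits.append(j)
    return hits
```

For instance, `wu_manber_search("abc", "abd", 1)` reports end positions 1 and 2, since both "ab" and "abd" are within one edit of "abc". Each text character costs O(k) word operations when the pattern fits in a machine word, which is what makes this family of algorithms attractive for data-parallel hardware.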

Keywords

bit parallelism, pattern matching, sequence comparison, neighborhood indexing, GPU, OpenCL

References

  1. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215(3), 403–413 (1990)
  2. Baeza-Yates, R.A., Gonnet, G.H.: A new approach to text searching. SIGIR Forum 23(3-4), 7 (1989)
  3. Brown, D.G.: A survey of seeding for sequence alignment. In: Bioinformatics Algorithms: Techniques and Applications, pp. 126–152 (2008)
  4. Charalambous, M., Trancoso, P., Stamatakis, A.P.: Initial Experiences Porting a Bioinformatics Application to a Graphics Processor. In: Bozanis, P., Houstis, E.N. (eds.) PCI 2005. LNCS, vol. 3746, pp. 415–425. Springer, Heidelberg (2005)
  5. Gummaraju, J., Morichetti, L., Houston, M., Sander, B., Gaster, B.R., Zheng, B.: Twin peaks: a software platform for heterogeneous computing on general-purpose and graphics processors. In: Parallel Architectures and Compilation Techniques (PACT 2010), pp. 205–216 (2010)
  6. Gusfield, D.: Algorithms on Strings, Trees, and Sequences (1997)
  7. Hennessy, J.L., Patterson, D.A.: Computer Architecture, A Quantitative Approach. Morgan Kaufmann (2006)
  8. Hirashima, K., Bannai, H., Matsubara, W., Ishino, A., Shinohara, A.: Bit-parallel algorithms for computing all the runs in a string. In: Prague Stringology Conference, PSC 2009 (2009)
  9. Ma, B., Tromp, J., Li, M.: PatternHunter: faster and more sensitive homology search. Bioinformatics 18(3), 440–445 (2002)
  10. Navarro, G.: NR-grep: a fast and flexible pattern-matching tool. Software: Practice and Experience 31(13), 1265–1312 (2001)
  11. Navarro, G., Raffinot, M.: Flexible Pattern Matching in Strings – Practical on-line search algorithms for texts and biological sequences (2002)
  12. Noé, L., Kucherov, G.: YASS: enhancing the sensitivity of DNA similarity search. Nucleic Acids Research 33(S2), W540–W543 (2005)
  13. Peterlongo, P., Noé, L., Lavenier, D., Nguyen, V.H., Kucherov, G., Giraud, M.: Optimal neighborhood indexing for protein similarity search. BMC Bioinformatics 9(534) (2008)
  14. Pisanti, N., Giraud, M., Peterlongo, P.: Filters and Seeds Approaches for Fast Homology Searches in Large Datasets. Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications (2011)
  15. Varré, J.-S., Schmidt, B., Janot, S., Giraud, M.: Manycore High-Performance Computing in Bioinformatics. In: Advances in Genomic Sequence Analysis and Pattern Discovery. World Scientific (2011)
  16. Wu, S., Manber, U.: Fast Text Searching Allowing Errors. Communications of the ACM 35(10), 83–91 (1992)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Tuan Tu Tran (1, 2)
  • Mathieu Giraud (1, 2)
  • Jean-Stéphane Varré (1, 2)
  1. LIFL, UMR 8022 CNRS, Université Lille 1, France
  2. INRIA Lille, Villeneuve d’Ascq, France
