Large Scale Protein Sequence Alignment Using FPGA Reprogrammable Logic Devices

  • Stefan Dydel
  • Piotr Bała
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3203)

Abstract

In this paper we show how to significantly accelerate the Smith-Waterman protein sequence alignment algorithm using reprogrammable logic devices, FPGAs (Field Programmable Gate Arrays). Because of its perfect sensitivity, the Smith-Waterman algorithm is important in computational biology, but its computational complexity makes it impractical for large database searches on general-purpose computers.

The current approach supports amino acid sequence alignment with a full substitution matrix, which leads to a more complex formula than the one used in DNA alignment and is much more memory demanding. We propose a parallelization scheme different from the commonly used systolic arrays, leading to full utilization of the PUs (Processing Units) regardless of sequence length. An FPGA-based implementation of the Smith-Waterman algorithm can accelerate sequence alignment on a Pentium desktop computer by two orders of magnitude compared to the standard OSEARCH program from the FASTA package.
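
For orientation, the scoring recurrence behind Smith-Waterman alignment with a full substitution matrix can be sketched in software. The following is a minimal Python sketch, not the FPGA design described in the paper. It assumes a linear gap penalty and a small toy substitution table; the names `toy_matrix` and `smith_waterman_score`, the alphabet, the gap value, and all scores are illustrative assumptions. Every cell of the query-by-target table is evaluated, so the work grows with the product of the two sequence lengths, which is what makes large database searches expensive on general-purpose computers.

```python
from itertools import product


def toy_matrix(alphabet, match=3, mismatch=-1, similar=None):
    """Build a full symmetric substitution table keyed by residue pair.

    Hypothetical helper: real protein searches would use a 20x20 matrix
    such as BLOSUM; the scores here are made up for illustration.
    """
    table = {
        (a, b): (match if a == b else mismatch)
        for a, b in product(alphabet, repeat=2)
    }
    for (a, b), score in (similar or {}).items():
        table[(a, b)] = table[(b, a)] = score
    return table


def smith_waterman_score(query, target, subst, gap=-2):
    """Return the best local alignment score (linear gap penalty assumed)."""
    prev = [0] * (len(target) + 1)  # previous row of the score table
    best = 0
    for a in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, b in enumerate(target, start=1):
            score = max(
                0,                            # start a new local alignment
                prev[j - 1] + subst[(a, b)],  # substitute a for b (table lookup)
                prev[j] + gap,                # gap in the target
                curr[j - 1] + gap,            # gap in the query
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Usage: D and E are treated as similar residues by the toy table.
subst = toy_matrix("ACDE", similar={("D", "E"): 2})
print(smith_waterman_score("ACD", "ACE", subst, gap=-2))  # 8
```

For DNA the substitution term usually reduces to a match or mismatch constant, whereas here each cell needs a table lookup indexed by both residues, which illustrates why the protein case is described as more complex and more memory demanding.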

Keywords

Field Programmable Gate Array; Systolic Array; Edit Operation; Longest Common Subsequence; General Purpose Computer
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Stefan Dydel (1)
  • Piotr Bała (1)
  1. Faculty of Mathematics and Computer Science, N. Copernicus University, Torun, Poland