FPGA-Based Smith-Waterman Algorithm: Analysis and Novel Design

  • Yoshiki Yamaguchi
  • Hung Kuen Tsoi
  • Wayne Luk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6578)


This paper analyses two methods of organizing parallelism for the Smith-Waterman algorithm, and shows how they perform relative to peak performance when the amount of parallelism varies. A novel systolic design is introduced, with a processing element optimized for computing the affine gap cost function. Our FPGA design is significantly more energy-efficient than GPU designs. For example, our design for the XC5VLX330T FPGA achieves around 16 GCUPS/W, while CPUs and GPUs have a power efficiency below 0.5 GCUPS/W.
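As a software reference point for the affine gap cost function mentioned above, the following is a minimal sketch of Smith-Waterman local alignment scoring with affine gaps, using the standard three-value recurrence per cell (as in Gotoh's formulation). It is not the paper's processing element design. The function name `sw_affine_score` and the scoring parameters (`match`, `mismatch`, `gap_open`, `gap_extend`) are illustrative assumptions, with a simple match/mismatch score standing in for a substitution matrix.

```python
def sw_affine_score(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Best local alignment score of a and b under an affine gap cost.

    Assumed cost model: a gap of length k costs gap_open + (k - 1) * gap_extend.
    All names and default scores here are hypothetical, for illustration only.
    """
    neg_inf = float("-inf")
    m = len(b)
    # H: best score of an alignment ending at cell (i, j).
    # F: best score ending at (i, j) with a gap that consumes a[i-1] (vertical).
    h_prev = [0] * (m + 1)
    f_prev = [neg_inf] * (m + 1)
    best = 0
    for i in range(1, len(a) + 1):
        h_cur = [0] * (m + 1)
        f_cur = [neg_inf] * (m + 1)
        # E: best score ending at (i, j) with a gap that consumes b[j-1] (horizontal).
        e = neg_inf
        for j in range(1, m + 1):
            # Either open a new gap from H or extend the gap already running.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 term restarts the alignment, which makes it local.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(sw_affine_score("ACGTACGT", "ACGTTTACGT"))
```

Each inner-loop iteration is one cell update, the unit counted by the GCUPS (giga cell updates per second) figure quoted above. Cells on the same anti-diagonal depend only on earlier anti-diagonals, which is the property that lets a systolic array compute them in parallel.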


Performance comparison, dynamic programming





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Yoshiki Yamaguchi (1)
  • Hung Kuen Tsoi (2)
  • Wayne Luk (2)
  1. Graduate School of Systems and Information Engineering, University of Tsukuba, Tsukuba, Japan
  2. Department of Computing, Imperial College London, London, United Kingdom
