Abstract
Sequence alignment is one of the most widely used applications in bioinformatics. The exponential growth of biological sequence data has become a severe problem for processing on standard general-purpose PCs. Large computing clusters are a widely accepted solution, but their acquisition and maintenance, as well as their space and energy requirements, introduce significant costs. This chapter shows that the problem can instead be addressed with RIVYERA, a high-performance computing platform based on reconfigurable hardware (in particular FPGAs). It describes the implementations of three widely used bioinformatics applications: optimal sequence alignment with the Needleman–Wunsch and Smith–Waterman algorithms, protein database search with BLASTp, and short-read sequence alignment with a BWA-like algorithm. The results show that RIVYERA clearly outperforms standard PCs and GPU systems and saves more than 90% of the energy compared with PC clusters, while a single RIVYERA occupies only 3U–4U in a standard server rack.
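For readers unfamiliar with the optimal alignment algorithms named in the abstract, the following is a minimal software sketch of the Smith–Waterman local alignment score. It is not the chapter's FPGA implementation. The function name `smith_waterman_score`, the linear gap model, and the scoring values (match 2, mismatch -1, gap -1) are assumptions chosen only for illustration.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b.

    Illustrative sketch only: linear gap penalty and the default
    scoring values are assumptions, not taken from the chapter.
    """
    # Only the previous row of the dynamic-programming matrix is needed
    # to compute the score, so memory is O(len(b)).
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets a local alignment start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "YYACGTYY"))
```

The Needleman–Wunsch global alignment follows the same recurrence without the clamp at 0: the first row and column are initialised with cumulative gap penalties, and the score is read from the last cell of the matrix.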
Copyright information
© 2013 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Wienbrandt, L. (2013). Bioinformatics Applications on the FPGA-Based High-Performance Computer RIVYERA. In: Vanderbauwhede, W., Benkrid, K. (eds) High-Performance Computing Using FPGAs. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1791-0_3
DOI: https://doi.org/10.1007/978-1-4614-1791-0_3
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-1790-3
Online ISBN: 978-1-4614-1791-0
eBook Packages: Engineering, Engineering (R0)