SWIMM 2.0: Enhanced Smith–Waterman on Intel’s Multicore and Manycore Architectures Based on AVX-512 Vector Extensions

International Journal of Parallel Programming

Abstract

The well-known Smith–Waterman (SW) algorithm is the most commonly used method for local sequence alignment, but its adoption is limited by its computational cost on large protein databases. Although the acceleration of SW has already been studied on many parallel platforms, few studies exploit the latest Intel architectures based on AVX-512 vector extensions. This SIMD instruction set is currently supported by Intel’s Knights Landing (KNL) accelerator and Intel’s Skylake (SKL) general-purpose processors. In this paper, we present an SW version that is optimized for both architectures: SWIMM 2.0. The novelty of this vector instruction set requires the revision of previous programming and optimization techniques. SWIMM 2.0 relies on massive multi-threading and SIMD exploitation. It is competitive in terms of performance with other state-of-the-art implementations, reaching 511 GCUPS on a single KNL node and 734 GCUPS on a server equipped with a dual SKL processor. Moreover, these performance rates make SWIMM 2.0 the most energy-efficient implementation in this study, achieving 2.94 GCUPS/Watt on the SKL processor.
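For readers unfamiliar with the quantities mentioned in the abstract, the following is a minimal scalar sketch of the SW local alignment score with affine gaps (the Gotoh recurrences [22]) and of the GCUPS metric (billions of cell updates per second). It is not the vectorized SWIMM 2.0 kernel. The function names `sw_score` and `gcups`, the simple match/mismatch scoring (instead of a substitution matrix) and the gap penalties are assumptions chosen only for illustration; a gap of length k is assumed to cost `gap_open + k * gap_extend`.

```python
def sw_score(query, subject, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Optimal local alignment score with affine gaps (scalar reference sketch).

    Illustrative scoring only: a gap of length k costs gap_open + k * gap_extend.
    """
    n = len(subject)
    h_prev = [0] * (n + 1)  # H values of the previous row
    f = [0] * (n + 1)       # F values: best score ending in a vertical gap
    best = 0
    for q in query:
        h_cur = [0] * (n + 1)
        e = 0  # E value: best score ending in a horizontal gap
        for j in range(1, n + 1):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f[j] = max(f[j] - gap_extend, h_prev[j] - gap_open - gap_extend)
            diag = h_prev[j - 1] + (match if q == subject[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            h = max(0, diag, e, f[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev = h_cur
    return best


def gcups(query_len, db_residues, seconds):
    """Billions of dynamic-programming cell updates per second."""
    return query_len * db_residues / (seconds * 1e9)


if __name__ == "__main__":
    print(sw_score("ACGT", "ACGT"))
    print(gcups(1000, 1_000_000_000, 2.0))
```

Under this definition, a search that aligns a 1000-residue query against a database of 10^9 residues in 2 s performs 10^12 cell updates and therefore runs at 500 GCUPS, which gives a sense of scale for the rates reported above.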

Notes

  1. SWIMM 2.0 is available at https://github.com/enzorucci/SWIMM2.0.

  2. SWIPE is available at a public repository: https://github.com/torognes/swipe.

  3. Parasail is available at a public repository: https://github.com/jeffdaily/parasail.

  4. libssa is available at a public repository: https://github.com/RonnySoak/libssa.

  5. FASTA format description: http://blast.ncbi.nlm.nih.gov/blastcgihelp.shtml.

  6. Swiss-Prot: http://www.uniprot.org/downloads.

  7. Environmental NR: ftp://ftp.ncbi.nih.gov/blast/db/FASTA/env_nr.gz.

  8. TrEMBL: http://www.uniprot.org/downloads.

  9. SSE4.1 and AVX2 versions using the QP technique were excluded from the analysis to improve figure readability since we found that the SP scheme always achieved the best performance, as in previous works [14].

  10. We have discarded the comparison with the SWhybrid framework [15] because we detected inconsistent alignment results in most of the experiments.

  11. The SSE4.1 and AVX2 versions using the QP technique were excluded from the analysis to improve figure readability since we found that the SP scheme always achieved the best performance, as in previous works [14].

  12. Once again, we have discarded the comparison with the SWhybrid framework [15] because we detected inconsistent alignment results in most of the experiments.

References

  1. Bender, E.: Big data in biomedicine: 4 big questions. Nature 527, S19 (2015)

  2. Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J.: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25(17), 3389 (1997)

  3. Pearson, W.R., Lipman, D.J.: Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85(8), 2444 (1988). https://doi.org/10.1073/pnas.85.8.2444

  4. Sæbø, P.E., Andersen, S.M., Myrseth, J., Laerdahl, J.K., Rognes, T.: PARALIGN: rapid and sensitive sequence similarity searches powered by parallel computing technology. Nucleic Acids Res. 33(Suppl 2), W535 (2005)

  5. Farrar, M.: Striped Smith–Waterman speeds database searches six times over other SIMD implementations. Bioinformatics 23(2), 156 (2007)

  6. Rucci, E., García, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matías, M.: State-of-the-Art in Smith–Waterman Protein Database Search on HPC Platforms, pp. 197–223. Springer, New York (2016). https://doi.org/10.1007/978-3-319-41279-5_6

  7. Rognes, T.: Faster Smith–Waterman database searches with inter-sequence SIMD parallelisation. BMC Bioinform. 12(1), 221 (2011). https://doi.org/10.1186/1471-2105-12-221

  8. Frielingsdorf, J.T.: Improving optimal sequence alignments through a SIMD-accelerated library. Master’s thesis, University of Oslo (2015)

  9. Daily, J.: Parasail: SIMD C library for global, semi-global, and local pairwise sequence alignments. BMC Bioinform. 17, 81 (2016)

  10. Liu, Y., Schmidt, B., Maskell, D.L.: CUDASW++2.0: enhanced Smith–Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions. BMC Res. Notes 3(1), 1 (2010). https://doi.org/10.1186/1756-0500-3-93

  11. Liu, Y., Wirawan, A., Schmidt, B.: CUDASW++ 3.0: accelerating Smith–Waterman protein database search by coupling CPU and GPU SIMD instructions. BMC Bioinform. 14, 117 (2013)

  12. Liu, Y., Schmidt, B.: SWAPHI: Smith–Waterman protein database search on Xeon Phi coprocessors. In: 25th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2014) (2014)

  13. Lan, H., Liu, W., Schmidt, B., Wang, B.: Accelerating large-scale biological database search on Xeon Phi-based neo-heterogeneous architectures. In: 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2015), pp. 503–510. https://doi.org/10.1109/BIBM.2015.7359735

  14. Rucci, E., García, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matías, M.: An energy-aware performance analysis of SWIMM: Smith–Waterman implementation on Intel’s Multicore and Manycore architectures. Concurr. Comput. Pract. Exp. 27(18), 5517 (2015). https://doi.org/10.1002/cpe.3598

  15. Lan, H., Liu, W., Liu, Y., Schmidt, B.: SWhybrid: a hybrid-parallel framework for large-scale protein sequence database search. In: 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2017), pp. 42–51. https://doi.org/10.1109/IPDPS.2017.42

  16. Isa, M., Benkrid, K., Clayton, T., Ling, C., Erdogan, A.: An FPGA-based parameterised and scalable optimal solutions for pairwise biological sequence analysis. In: Adaptive Hardware and Systems (AHS), 2011 NASA/ESA Conference on (2011), pp. 344–351. https://doi.org/10.1109/AHS.2011.5963957

  17. Oliver, T.F., Schmidt, B., Maskell, D.L.: Reconfigurable architectures for bio-sequence database scanning on FPGAs. IEEE Trans. Circuits Syst. II Express Briefs 52(12), 851 (2005). https://doi.org/10.1109/TCSII.2005.853340

  18. Li, T.I., Shum, W., Truong, K.: 160-fold acceleration of the Smith–Waterman algorithm using a field programmable gate array (FPGA). BMC Bioinform. 8, 185 (2007)

  19. Rucci, E., García, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matías, M.: OSWALD: OpenCL Smith–Waterman algorithm on Altera FPGA for large protein databases. Int. J. High Perform. Comput. Appl. (2016). https://doi.org/10.1177/1094342016654215

  20. Rucci, E., García, C., Botella, G., De Giusti, A., Naiouf, M., Prieto-Matías, M.: First experiences accelerating Smith–Waterman on Intel’s Knights Landing processor. In: Ibrahim, S., Choo, K.K.R., Yan, Z., Pedrycz, W. (eds.) Algorithms and Architectures for Parallel Processing: 17th International Conference, ICA3PP 2017, Helsinki, Finland, August 21–23, 2017, Proceedings, pp. 569–579. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-65482-9_42

  21. Smith, T.F., Waterman, M.S.: Identification of common molecular subsequences. J. Mol. Biol. 147(1), 195 (1981)

  22. Gotoh, O.: An improved algorithm for matching biological sequences. J. Mol. Biol. 162, 705–708 (1982)

  23. Sodani, A., Gramunt, R., Corbal, J., Kim, H.S., Vinod, K., Chinthamani, S., Hutsell, S., Agarwal, R., Liu, Y.C.: Knights landing: second-generation Intel Xeon Phi product. IEEE Micro 36(2), 34 (2016). https://doi.org/10.1109/MM.2016.25

  24. Asai, R.: MCDRAM as High-Bandwidth Memory (HBM) in Knights Landing Processors: Developer’s Guide (2016). https://goparallel.sourceforge.net/wp-content/uploads/2016/05/Colfax_KNL_MCDRAM_Guide.pdf

  25. Intel Corporation: Intel 64 and IA-32 Architectures Optimization Reference Manual (2017). https://software.intel.com/sites/default/files/managed/9e/bc/64-ia-32-architectures-optimization-manual.pdf

  26. Rognes, T., Seeberg, E.: Six-fold speed-up of Smith–Waterman sequence database searches using parallel processing on common microprocessors. Bioinformatics 16(8), 699 (2000). https://doi.org/10.1093/bioinformatics/16.8.699

Acknowledgements

This work has been supported by the EU (FEDER) and the Spanish MINECO, under Grant TIN2015-65277-R and the CAPAP-H6 network (TIN2016-81840-REDT).

Author information

Corresponding author

Correspondence to Carlos Garcia Sanchez.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rucci, E., Garcia Sanchez, C., Botella Juan, G. et al. SWIMM 2.0: Enhanced Smith–Waterman on Intel’s Multicore and Manycore Architectures Based on AVX-512 Vector Extensions. Int J Parallel Prog 47, 296–316 (2019). https://doi.org/10.1007/s10766-018-0585-7

  • DOI: https://doi.org/10.1007/s10766-018-0585-7
