Optimization of SAMtools sorting using OpenMP tasks

Weeks, Nathan T.; Luecke, Glenn R.

doi:10.1007/s10586-017-0874-8

Published: 26 April 2017

Cluster Computing, volume 20, pages 1869–1880 (2017)

Nathan T. Weeks¹ &
Glenn R. Luecke²

Abstract

SAMtools is a widely-used genomics application for post-processing high-throughput sequence alignment data. Such sequence alignment data are commonly sorted to make downstream analysis more efficient. However, this sorting process itself can be computationally- and I/O-intensive: high-throughput sequence alignment files in the de facto standard binary alignment/map (BAM) format can be many gigabytes in size, and may need to be decompressed before sorting and compressed afterwards. As a result, BAM-file sorting can be a bottleneck in genomics workflows. This paper describes a case study on the performance analysis and optimization of SAMtools for sorting large BAM files. OpenMP task parallelism and memory optimization techniques resulted in a speedup of 5.9X versus the upstream SAMtools 1.3.1 for an internal (in-memory) sort of 24.6 GiB of compressed BAM data (102.6 GiB uncompressed) with 32 processor cores, while a 1.98X speedup was achieved for an external (out-of-core) sort of a 271.4 GiB BAM file.
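
As a rough illustration of the OpenMP task parallelism the abstract refers to, the following is a minimal sketch of a task-parallel merge sort in C. It is not the authors' implementation: it sorts plain `int` values rather than BAM records, and the names `task_merge_sort`, `sort_range`, `merge_halves`, and the `TASK_CUTOFF` value are hypothetical choices made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Ranges smaller than this are sorted without deferring a task, to limit
   task-creation overhead (illustrative value, not taken from the paper). */
#define TASK_CUTOFF 4096

/* Merge the sorted halves a[lo..mid) and a[mid..hi) through scratch buffer tmp. */
static void merge_halves(int *a, int *tmp, size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        tmp[k++] = (a[j] < a[i]) ? a[j++] : a[i++]; /* ties take the left element */
    while (i < mid) tmp[k++] = a[i++];
    while (j < hi)  tmp[k++] = a[j++];
    memcpy(a + lo, tmp + lo, (hi - lo) * sizeof *a);
}

/* Sort a[lo..hi): the left half becomes a task, the right half is sorted by
   the current thread, and taskwait joins both before the merge. */
static void sort_range(int *a, int *tmp, size_t lo, size_t hi)
{
    if (hi - lo < 2) return;
    size_t mid = lo + (hi - lo) / 2;
    #pragma omp task firstprivate(a, tmp, lo, mid) if (hi - lo >= TASK_CUTOFF)
    sort_range(a, tmp, lo, mid);
    sort_range(a, tmp, mid, hi);
    #pragma omp taskwait
    merge_halves(a, tmp, lo, mid, hi);
}

/* Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int task_merge_sort(int *a, size_t n)
{
    if (n < 2) return 0;
    int *tmp = malloc(n * sizeof *tmp);
    if (!tmp) return -1;
    /* One thread generates the root of the task tree; the rest of the team
       executes tasks as they become available. */
    #pragma omp parallel
    #pragma omp single
    sort_range(a, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), the recursive calls on large ranges run as tasks; without it, the pragmas are ignored and the same code runs serially with identical results.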

Notes

Source code for the SAMtools optimizations is available at https://doi.org/10.5281/zenodo.262169, and for the HTSlib optimizations at https://doi.org/10.5281/zenodo.262161
https://github.com/samtools/htslib/pull/51
http://www.htslib.org/benchmarks/zlib.html
https://github.com/samtools/htslib/pull/397
https://github.com/smowton/htslib/compare/parallel_read
Note that taskyield is a no-op as of GCC 6.2.0.
This check could occur after task generation and before returning from the routine; however, the implementation did not consistently perform as well in practice, possibly due to an undetermined effect on task scheduling.

Acknowledgements

This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.

Author information

Authors and Affiliations

Department of Computer Science, Iowa State University, Ames, IA, USA
Nathan T. Weeks
Department of Mathematics, Iowa State University, Ames, IA, USA
Glenn R. Luecke

Corresponding author

Correspondence to Nathan T. Weeks.

Electronic Supplementary Material

Appendix (PDF 26 KB)

Appendix (PDF 22 KB)

About this article

Cite this article

Weeks, N.T., Luecke, G.R. Optimization of SAMtools sorting using OpenMP tasks. Cluster Comput 20, 1869–1880 (2017). https://doi.org/10.1007/s10586-017-0874-8

Received: 10 December 2016
Revised: 27 February 2017
Accepted: 17 April 2017
Published: 26 April 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s10586-017-0874-8
