mBWA: A Massively Parallel Sequence Reads Aligner
Mapping sequenced reads to a reference genome, also known as sequence read alignment, is central to sequence analysis. Emerging sequencing technologies such as next-generation sequencing (NGS) have led to an explosion of sequencing data that far exceeds the processing capabilities of existing alignment tools. Consequently, sequence alignment has become the bottleneck of sequence analysis, and intensive computing power is required to address this challenge. A key feature of sequence alignment is that different reads are independent of one another. Exploiting this property, we propose a multi-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and develop a massively parallel sequence aligner, mBWA. mBWA contains two levels of parallelization: first, parallelization of data input/output (IO) and read alignment by a three-stage parallel pipeline; second, parallelization enabled by Intel Many Integrated Core (MIC) coprocessor technology. In this paper, we demonstrate that mBWA outperforms BWA through the combination of these techniques. To the best of our knowledge, mBWA is the first sequence alignment tool to run on Intel MIC, and it can achieve more than 5-fold speedup over the original BWA while maintaining alignment precision.
Keywords: NGS, sequence aligner, BWA, parallelization, MIC coprocessor
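The three-stage pipeline mentioned in the abstract can be illustrated with a minimal sketch. It assumes one thread per stage (read input, align, write output) connected by bounded queues; this is an illustration of the general idea, not mBWA's actual implementation. The names `toy_align` and `run_pipeline`, the batch size, and the queue depth are all hypothetical, and the exact-match "aligner" is only a placeholder for BWA's real algorithm. The MIC offloading level is not shown.

```python
import queue
import threading

SENTINEL = None  # marks end of stream between stages


def toy_align(read, reference):
    """Hypothetical stand-in for an aligner: exact-match position, or -1."""
    return reference.find(read)


def run_pipeline(reads, reference, batch_size=2, queue_depth=4):
    """Overlap input, alignment, and output using one thread per stage."""
    to_align = queue.Queue(maxsize=queue_depth)  # stage 1 -> stage 2
    to_write = queue.Queue(maxsize=queue_depth)  # stage 2 -> stage 3
    results = []

    def reader():
        # Stage 1: group reads into batches and hand them downstream.
        for start in range(0, len(reads), batch_size):
            to_align.put(reads[start:start + batch_size])
        to_align.put(SENTINEL)

    def aligner():
        # Stage 2: each read is aligned independently of the others.
        while (batch := to_align.get()) is not SENTINEL:
            to_write.put([(r, toy_align(r, reference)) for r in batch])
        to_write.put(SENTINEL)

    def writer():
        # Stage 3: collect results; a real tool would write them to disk.
        while (batch := to_write.get()) is not SENTINEL:
            results.extend(batch)

    threads = [threading.Thread(target=f) for f in (reader, aligner, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the queues are bounded, the reader cannot run arbitrarily far ahead of the aligner, which keeps memory use limited while the stages overlap. In CPython the global interpreter lock prevents CPU-bound threads from running in parallel, so this sketch shows only the structure of such a pipeline, not its speedup.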