
mBWA: A Massively Parallel Sequence Reads Aligner

  • Yingbo Cui
  • Xiangke Liao
  • Xiaoqian Zhu
  • Bingqiang Wang
  • Shaoliang Peng
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 294)

Abstract

Mapping sequenced reads to a reference genome, also known as sequence read alignment, is central to sequence analysis. Emerging sequencing technologies such as next-generation sequencing (NGS) have led to an explosion of sequencing data that far exceeds the processing capabilities of existing alignment tools. Consequently, sequence alignment has become the bottleneck of sequence analysis, and intensive computing power is required to address this challenge. A key feature of sequence alignment is that different reads are independent of one another. Exploiting this property, we propose a multi-level parallelization strategy to speed up BWA, a widely used sequence alignment tool, and develop a massively parallel sequence aligner, mBWA. mBWA contains two levels of parallelization: first, parallelization of data input/output (I/O) and read alignment through a three-stage parallel pipeline; second, parallelization enabled by Intel Many Integrated Core (MIC) coprocessor technology. In this paper, we demonstrate that mBWA outperforms BWA through the combination of these techniques. To the best of our knowledge, mBWA is the first sequence alignment tool to run on Intel MIC, and it can achieve more than 5-fold speedup over the original BWA while maintaining alignment precision.
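The three-stage pipeline mentioned in the abstract can be illustrated with a minimal sketch. This is not the mBWA implementation, which the abstract does not detail. It assumes the three stages are read input, alignment, and result output, connected by bounded queues. The names `run_pipeline` and `toy_align` are hypothetical, and `toy_align` is an exact-match placeholder rather than the BWA algorithm. Python threads do not run CPU-bound work in parallel, so the sketch shows only the structure: stages overlap, and because reads are independent, batches can be aligned in any order and reassembled by index.

```python
import queue
import threading

SENTINEL = None  # marks end of stream between stages


def toy_align(read, reference):
    """Placeholder for a real aligner: exact-match position, or -1 if absent."""
    return reference.find(read)


def run_pipeline(reads, reference, batch_size=2, workers=2):
    """Overlap input, alignment, and output using three thread-backed stages."""
    to_align = queue.Queue(maxsize=4)  # bounded queues keep memory use limited
    to_write = queue.Queue(maxsize=4)
    results = {}

    def reader():
        # Stage 1: group reads into indexed batches and pass them on.
        for start in range(0, len(reads), batch_size):
            to_align.put((start, reads[start:start + batch_size]))
        for _ in range(workers):
            to_align.put(SENTINEL)  # one stop signal per aligner thread

    def aligner():
        # Stage 2: reads are independent, so batches can be aligned in any order.
        while True:
            item = to_align.get()
            if item is SENTINEL:
                to_write.put(SENTINEL)
                return
            start, batch = item
            to_write.put((start, [toy_align(r, reference) for r in batch]))

    def writer():
        # Stage 3: collect results; the batch index restores input order.
        finished = 0
        while finished < workers:
            item = to_write.get()
            if item is SENTINEL:
                finished += 1
                continue
            start, positions = item
            for offset, pos in enumerate(positions):
                results[start + offset] = pos

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    threads += [threading.Thread(target=aligner) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(reads))]
```

Calling `run_pipeline(["ACGT", "GGGG"], "ACGTACGTTTGA")` returns one position per read in input order, with -1 for a read that does not occur in the reference. A production design would replace the threads in stage 2 with processes or coprocessor offload, which is where the second level of parallelization described in the abstract would apply.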

Keywords

NGS, sequence aligner, BWA, parallelization, MIC coprocessor


Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yingbo Cui (1)
  • Xiangke Liao (1)
  • Xiaoqian Zhu (1)
  • Bingqiang Wang (2)
  • Shaoliang Peng (1)

  1. National University of Defense Technology, Changsha, China
  2. BGI, Shenzhen, China
