Resolving Load Balancing Issues in BWA on NUMA Multicore Architectures

  • Charlotte Herzeel
  • Thomas J. Ashby
  • Pascal Costanza
  • Wolfgang De Meuter
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8385)


Running BWA in multithreaded mode on a multi-socket server results in poor scaling behaviour. This is because the current parallelisation strategy does not account for the load imbalance inherent in the data being aligned, e.g. varying read lengths and numbers of mutations. Additional load imbalance arises because the BWA code does not anticipate certain hardware characteristics of multi-socket multicores, such as non-uniform memory access times across cores. We show that rewriting the parallel section using Cilk removes the load imbalance, resulting in a twofold performance improvement over the original BWA.
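
The load-imbalance argument above can be illustrated with a toy model. The sketch below is not the authors' implementation and does not use BWA or Cilk. It assumes a static, up-front split of reads across workers as the baseline, and approximates dynamic load balancing with greedy list scheduling, where an idle worker always takes the next unprocessed read. The function names and the per-read costs are hypothetical and chosen only for illustration.

```python
import heapq


def static_makespan(costs, workers):
    """Time to finish when each worker gets a fixed contiguous block of reads.

    Assumption: the split is decided up front and never rebalanced.
    """
    n = len(costs)
    block = (n + workers - 1) // workers  # ceiling division
    loads = [sum(costs[w * block:(w + 1) * block]) for w in range(workers)]
    return max(loads)


def dynamic_makespan(costs, workers):
    """Time to finish when the next read always goes to the earliest-idle worker.

    This greedy list scheduling is only a rough stand-in for a dynamic,
    work-stealing style scheduler; it ignores scheduling and NUMA overheads.
    """
    finish_times = [0] * workers
    heapq.heapify(finish_times)
    for cost in costs:
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + cost)
    return max(finish_times)


# Hypothetical per-read alignment costs: twelve cheap reads followed by
# four expensive ones (e.g. longer reads or reads with more mutations).
costs = [1] * 12 + [9] * 4

print(static_makespan(costs, 4))   # 36: one worker receives all expensive reads
print(dynamic_makespan(costs, 4))  # 12: the work is spread evenly
```

In this toy input the total work is 48 units, so 12 is the best any schedule can achieve on four workers. The static split finishes only when its most heavily loaded worker does, which is the kind of imbalance the abstract attributes to variation in the data.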


BWA, Multithreading, NUMA, Load balancing, Cilk



This work is funded by Intel, Janssen Pharmaceutica and by the Institute for the Promotion of Innovation through Science and Technology in Flanders (IWT).



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Charlotte Herzeel (1, 4)
  • Thomas J. Ashby (1, 4)
  • Pascal Costanza (3, 4)
  • Wolfgang De Meuter (2)

  1. imec, Leuven, Belgium
  2. Software Languages Lab, Vrije Universiteit Brussel, Brussel, Belgium
  3. Intel, Kontich, Belgium
  4. ExaScience Life Lab, Leuven, Belgium
