Exploiting Multi-grain Parallelism for Efficient Selective Sweep Detection

  • Nikolaos Alachiotis
  • Pavlos Pavlidis
  • Alexandros Stamatakis
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7439)

Abstract

Selective sweep detection localizes targets of recent and strong positive selection by analyzing single nucleotide polymorphisms (SNPs) in intra-species multiple sequence alignments. Substantial advances in wet-lab sequencing technologies currently allow for generating unprecedented amounts of molecular data. The increasing numbers of sequences and SNPs in such large multiple sequence alignments cause prohibitively long execution times for population genetics data analyses that rely on selective sweep theory. To alleviate this problem, we have recently implemented fine- and coarse-grain parallel versions of OmegaPlus, our open-source tool for selective sweep detection, which is based on the ω statistic. A performance issue with the coarse-grain parallelization is that individual coarse-grain tasks exhibit significant run-time differences and hence cause load imbalance. Here, we introduce a significantly improved multi-grain parallelization scheme that outperforms both the fine-grain and the coarse-grain versions of OmegaPlus with respect to parallel efficiency. The multi-grain approach exploits both coarse-grain and fine-grain operations: threads/cores that have completed their own coarse-grain tasks are used to accelerate the slowest remaining task by means of fine-grain parallelism. A performance assessment on real-world and simulated datasets showed that the multi-grain version is up to 39% and 64.4% faster than the coarse-grain and the fine-grain versions, respectively, when the same number of threads is used.
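
The load-balancing idea described in the abstract can be illustrated with a small scheduling model. The sketch below is not the OmegaPlus implementation; it is an idealized simulation under stated assumptions: each thread owns one coarse-grain task with a known cost in abstract work units, a thread that finishes its own task joins the task with the most remaining work, helpers are reassigned only when some task completes, and synchronization overhead is ignored. The function names `coarse_grain_makespan` and `multi_grain_makespan` and the example costs are hypothetical.

```python
def coarse_grain_makespan(costs):
    """One thread per task: total run time is set by the slowest task."""
    return max(costs)


def multi_grain_makespan(costs):
    """Idealized multi-grain model (assumption, not the OmegaPlus code).

    Every task advances at 1 work unit per time unit on its own thread.
    Threads whose task is finished all help the task with the most
    remaining work, which then advances at (1 + helpers) units per time unit.
    Helpers are reassigned each time a task completes.
    """
    remaining = [float(c) for c in costs]
    now = 0.0
    while True:
        active = [i for i, r in enumerate(remaining) if r > 0.0]
        if not active:
            return now
        idle_threads = len(remaining) - len(active)
        slowest = max(active, key=lambda i: remaining[i])

        # Processing rate of each unfinished task during this interval.
        rates = {i: 1.0 for i in active}
        rates[slowest] += idle_threads

        # Advance the clock to the next task completion.
        finish_in = {i: remaining[i] / rates[i] for i in active}
        dt = min(finish_in.values())
        for i in active:
            if finish_in[i] == dt:
                remaining[i] = 0.0  # avoid floating-point residue
            else:
                remaining[i] -= rates[i] * dt
        now += dt


if __name__ == "__main__":
    costs = [10, 2, 3, 5]  # hypothetical per-task costs, one task per thread
    print(coarse_grain_makespan(costs), multi_grain_makespan(costs))
    # prints: 10 5.0
```

In this toy case the coarse-grain schedule is bound by the task of cost 10, while the multi-grain model finishes in 5 time units, which equals the total work (20 units) divided by the 4 threads. Real speedups are smaller because fine-grain parallel regions incur synchronization costs that this model leaves out.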

Keywords

Simulated Dataset · Temporary Worker · Selective Sweep · Parallelization Scheme · Cluster Workshop
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Nikolaos Alachiotis (1)
  • Pavlos Pavlidis (1)
  • Alexandros Stamatakis (1)
  1. The Exelixis Lab, Scientific Computing Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
