Resource-efficient utilization of CPU/GPU-based heterogeneous supercomputers for Bayesian phylogenetic inference
Bayesian inference is one of the most important methods for estimating phylogenetic trees in bioinformatics. Because of its potentially huge computational requirements, several parallel algorithms for Bayesian inference have been implemented to run on CPU-based clusters, multicore CPUs, or small clusters of CPUs and GPUs. To the best of our knowledge, however, none of the existing methods can simultaneously and fully utilize both CPUs and GPUs for the computations, so either the CPU part or the GPU part of a modern heterogeneous supercomputer is left idle. Aiming at optimized utilization of heterogeneous computing resources, a promising hardware architecture for future bioinformatics applications, we present a new hybrid parallel algorithm and implementation of Bayesian phylogenetic inference that combines MPI, OpenMP, and CUDA programming. The novelty of our algorithm, denoted oMC3, is its ability to use CPU cores simultaneously with GPUs for the computations while ensuring a fair work division between the two types of hardware components. We have implemented oMC3 based on MrBayes, one of the most popular software packages for Bayesian phylogenetic inference. Numerical experiments show that oMC3 achieves a 2.5× speedup over nMC3, a cutting-edge GPU implementation of MrBayes, on a single server with two GPUs and sixteen CPU cores. Moreover, oMC3 scales well when 128 GPUs and 1536 CPU cores are in use.
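The idea of a fair work division between CPU cores and GPUs can be illustrated with a minimal sketch. The abstract does not say how oMC3 divides the work, so the proportional rule below is only an assumption for illustration: work items (for example, likelihood evaluations) are split in proportion to each device type's measured throughput, so that both sides finish at roughly the same time. The function names `split_work` and `estimated_time` and the throughput figures are hypothetical and are not taken from oMC3 or MrBayes.

```python
def split_work(n_items, cpu_rate, gpu_rate):
    """Split n_items between CPU and GPU in proportion to throughput.

    cpu_rate and gpu_rate are hypothetical measured throughputs
    (items per second). Returns (cpu_items, gpu_items).
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    if cpu_rate < 0 or gpu_rate < 0 or cpu_rate + gpu_rate == 0:
        raise ValueError("rates must be non-negative and not both zero")
    # The GPU share is rounded; the CPU takes the remainder so that
    # every item is assigned exactly once.
    gpu_items = round(n_items * gpu_rate / (cpu_rate + gpu_rate))
    cpu_items = n_items - gpu_items
    return cpu_items, gpu_items


def estimated_time(cpu_items, gpu_items, cpu_rate, gpu_rate):
    """Wall-clock estimate when both device types run concurrently."""
    cpu_time = cpu_items / cpu_rate if cpu_items else 0.0
    gpu_time = gpu_items / gpu_rate if gpu_items else 0.0
    # The slower side determines the overall time.
    return max(cpu_time, gpu_time)


if __name__ == "__main__":
    # Assumed rates: the GPUs process items three times as fast as the CPU cores.
    cpu_items, gpu_items = split_work(1000, cpu_rate=1.0, gpu_rate=3.0)
    print(cpu_items, gpu_items)
    print(estimated_time(cpu_items, gpu_items, 1.0, 3.0))
```

Under these assumed rates, giving all 1000 items to the GPUs would take about 333 time units, whereas the proportional split takes 250, which shows why leaving the CPU cores idle wastes capacity.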
Keywords: Bayesian inference; CPU/GPU-based heterogeneous supercomputer; Hybrid programming; Resource-efficient utilization
The authors gratefully acknowledge support from the National Natural Science Foundation of China under NSFC Nos. 61033008, 60903041, and 61103080, the Research Fund for the Doctoral Program of Higher Education of China under SRFDP No. 20104307110002, the Hunan Provincial Innovation Foundation for Postgraduate under No. CX2010B028, the Fund of Innovation in Graduate School of NUDT under No. B100603, and Research Grant No. 214113 from the Research Council of Norway. We also acknowledge the experimental platform support from the National Supercomputing Center in Changsha.