Exploring New Search Algorithms and Hardware for Phylogenetics: RAxML Meets the IBM Cell
Phylogenetic inference is considered one of the grand challenges in bioinformatics because of its immense computational requirements. RAxML is currently among the fastest and most accurate programs for phylogenetic tree inference under the Maximum Likelihood (ML) criterion. First, we introduce new tree search heuristics that accelerate RAxML by a factor of 2.43 while returning equally good trees. The performance of the new search algorithm has been assessed on 18 real-world datasets comprising 148 to 4,843 DNA sequences. We then present the implementation, optimization, and evaluation of RAxML on the IBM Cell Broadband Engine. We address the problems of optimizing floating-point code, control flow, communication, and the scheduling of multi-level parallelism on the Cell, and provide solutions to them.
The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology
Volume 48, Issue 3, pp. 271-286
- Springer US
- phylogenetic inference
- maximum likelihood
- IBM cell
- Author Affiliations
- 1. School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- 2. Department of Computer Science, Center for High-end Computing Systems, Virginia Tech, Blacksburg, VA, USA
- 3. Department of Computer and Communications Engineering, University of Thessaly, Volos, Greece