Exploring New Search Algorithms and Hardware for Phylogenetics: RAxML Meets the IBM Cell

The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology

Abstract

Phylogenetic inference is considered one of the grand challenges in Bioinformatics because of its immense computational requirements. RAxML is currently among the fastest and most accurate programs for phylogenetic tree inference under the Maximum Likelihood (ML) criterion. First, we introduce new tree search heuristics that accelerate RAxML by a factor of 2.43 while returning equally good trees. The performance of the new search algorithm has been assessed on 18 real-world datasets comprising between 148 and 4,843 DNA sequences. We then present the implementation, optimization, and evaluation of RAxML on the IBM Cell Broadband Engine. We address the problems and provide solutions pertaining to the optimization of floating point code, control flow, communication, and scheduling of multi-level parallelism on the Cell.

Author information

Corresponding author

Correspondence to A. Stamatakis.

About this article

Cite this article

Stamatakis, A., Blagojevic, F., Nikolopoulos, D.S. et al. Exploring New Search Algorithms and Hardware for Phylogenetics: RAxML Meets the IBM Cell. J VLSI Sign Process Syst Sign Im 48, 271–286 (2007). https://doi.org/10.1007/s11265-007-0067-4

  • DOI: https://doi.org/10.1007/s11265-007-0067-4