Soft Computing, Volume 21, Issue 19, pp 5601–5620

Using mixed mode programming to parallelize an indicator-based evolutionary algorithm for inferring multiobjective phylogenetic histories

  • Sergio Santander-Jiménez
  • Miguel A. Vega-Rodríguez

Abstract

Many problems in bioinformatics research involve optimizing time-consuming objective functions over exponentially growing search spaces. Modern parallel systems composed of clustered multicore multiprocessors offer an opportunity to address such difficult problems. A suitable paradigm for exploiting these systems lies in the combination of mixed mode programming and evolutionary computation. This research focuses on the reconstruction of multiobjective phylogenetic hypotheses by using an indicator-based evolutionary algorithm. To overcome the main sources of complexity in the problem, we propose a parallel adaptation of this algorithm based on master–worker principles. Experimental results on six real data sets show that the design efficiently exploits a shared–distributed memory hybrid system composed of 48 processing cores, with improved scalability in comparison with other parallel proposals. In addition, the inferred Pareto fronts confirm the relevance of the indicator-based design, showing significant solution quality under different multiobjective metrics and biological testing procedures.
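
As an illustration of what "indicator-based" selection means, the following minimal sketch assigns fitness in the style of the general indicator-based evolutionary algorithm (IBEA) scheme. It is not the authors' implementation: the abstract does not say which quality indicator is used, so the additive epsilon indicator is assumed here, all objectives are assumed to be minimized, and the names `eps_indicator`, `ibea_fitness`, and the scaling factor `kappa` are hypothetical choices for this example.

```python
import math


def eps_indicator(a, b):
    """Additive epsilon indicator I({a}, {b}) for minimization.

    Returns the smallest shift eps such that a, moved by eps in every
    objective, weakly dominates b. Negative when a already dominates b.
    """
    return max(ai - bi for ai, bi in zip(a, b))


def ibea_fitness(points, kappa=0.05):
    """IBEA-style fitness: F(x) = sum over y != x of -exp(-I({y},{x}) / (c * kappa)).

    c is the largest absolute indicator value, used to scale the exponent.
    Higher (closer to zero) fitness is better. Objective normalization,
    which a full implementation would usually apply first, is skipped here.
    """
    n = len(points)
    # ind[i][j] holds I({x_j}, {x_i}): how far x_j is from dominating x_i.
    ind = [
        [eps_indicator(points[j], points[i]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    c = max(abs(v) for row in ind for v in row) or 1.0
    return [
        sum(-math.exp(-ind[i][j] / (c * kappa)) for j in range(n) if j != i)
        for i in range(n)
    ]


if __name__ == "__main__":
    # Three hypothetical two-objective score vectors; the third is dominated
    # by the second and should receive the worst fitness.
    population = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0)]
    for point, fit in zip(population, ibea_fitness(population)):
        print(point, fit)
```

In a master–worker design of the kind the abstract describes, the expensive part is typically the evaluation of each candidate's objective values, which workers can compute independently, while a fitness assignment like the one above needs the whole population and would plausibly stay with the master. That split is an assumption about such designs in general, not a statement of how the paper divides the work.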

Keywords

Parallel computing · Multiobjective optimization · Bioinformatics · Phylogenetic inference

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Sergio Santander-Jiménez (1)
  • Miguel A. Vega-Rodríguez (1)

  1. Department of Computer and Communications Technologies, University of Extremadura, Escuela Politécnica, Cáceres, Spain
