Workflows with Model Selection: A Multilocus Approach to Phylogenetic Analysis

  • Jorge Álvarez
  • Roberto Blanco
  • Elvira Mayordomo
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 93)


The workflow model for describing and executing complex tasks can be of great use in designing and parallelizing scientific experiments, yet its application to phylogenetic analysis remains scarcely studied. To remedy this situation, we identify sources of parallel tasks in the main reconstruction stages, as well as in two indispensable problems on which reconstruction depends: model selection and sequence alignment. Finally, we present a general-purpose implementation for use in cluster environments and examine the performance of our method by applying it to very large sets of whole mitochondrial genomes, through which problems of biological interest can be solved with new-found efficiency and accuracy.
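The kind of task-level parallelism mentioned in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation. It assumes that model selection for each locus is an independent task and that candidate models are ranked by the Akaike information criterion (AIC = 2k − 2 ln L). The locus names, log-likelihoods, and the functions `select_model` and `run_workflow` are hypothetical, and a thread pool stands in for a cluster scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-locus inputs: model name -> (log-likelihood, free parameters).
# The numbers are invented for illustration; parameter counts cover only the
# substitution model (branch lengths are left out).
CANDIDATES = {
    "locus_A": {"JC69": (-1250.0, 0), "HKY85": (-1200.0, 4), "GTR": (-1198.0, 8)},
    "locus_B": {"JC69": (-980.0, 0), "HKY85": (-975.0, 4), "GTR": (-960.0, 8)},
}


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def select_model(locus):
    """Independent task: pick the lowest-AIC model for a single locus."""
    scores = {m: aic(lnl, k) for m, (lnl, k) in CANDIDATES[locus].items()}
    best = min(scores, key=scores.get)
    return locus, best, scores[best]


def run_workflow(loci, max_workers=4):
    """Run the per-locus tasks concurrently and gather their results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(select_model, loci))
    return {locus: (model, score) for locus, model, score in results}


if __name__ == "__main__":
    for locus, (model, score) in run_workflow(["locus_A", "locus_B"]).items():
        print(locus, model, score)
```

Because the per-locus tasks share no state, they can be dispatched in any order; in a real cluster setting each call would presumably be a separate job that runs an external model-selection program.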


Keywords: Model Selection; Execution Environment; Independent Task; Concurrent Level; Sequence Alignment Problem





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jorge Álvarez (1)
  • Roberto Blanco (1)
  • Elvira Mayordomo (1)
  1. Departamento de Informática e Ingeniería de Sistemas (DIIS) & Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain
