The Journal of Supercomputing, Volume 64, Issue 1, pp 69–78

Performance analysis of computational approaches to solve Multiple Sequence Alignment

  • Alberto Montañola
  • Concepció Roig
  • Fernando Guirado
  • Porfidio Hernández
  • Cedric Notredame

Abstract

In the biotechnology field, deploying Multiple Sequence Alignment (MSA), a process that demands high-performance computing, is one of the new challenges to address on the new parallel systems. The aim of MSA is to find similar regions in biological sequences, and the goal of MSA applications is to align as many sequences as possible with a level of quality that makes the alignment biologically meaningful. We performed an efficiency study of different MSA implementations based on T-Coffee, one of the most widely used MSA aligners, in order to find new optimizations that may improve the average execution time on multi-core systems. We found that the current parallel implementations have performance issues that negatively affect the scalability of the process. Finally, we present a proposed implementation that uses threads in conjunction with a message-passing library, with the aim of optimizing the execution of MSA on multi-core-based clusters.

Keywords

Multiple sequence alignment; T-Coffee; MPI; Distributed computing; Multi-core

Notes

Acknowledgements

This work was supported by the MEyC-Spain under contracts TIN 2008-05913 and Consolider CSD2007-0050, by the CUR of DIUE of GENCAT, and by the European Social Fund.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Alberto Montañola (1)
  • Concepció Roig (1)
  • Fernando Guirado (1)
  • Porfidio Hernández (2)
  • Cedric Notredame (3)

  1. Distributed Computing Group, Computer Science and Industrial Engineering Department, Universitat de Lleida, Lleida, Spain
  2. Computer Architecture and Operating Systems Department, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
  3. Bioinformatics Programme, Centre de Regulació Genòmica (CRG-UPF), Barcelona, Spain