Abstract
In biotechnology, Multiple Sequence Alignment (MSA), a process that demands high-performance computing, is one of the new challenges to address on current parallel systems. The aim of MSA is to find similar regions in biological sequences, and the goal of MSA applications is to align as many sequences as possible at a level of quality that makes the alignment biologically meaningful. We performed an efficiency study of different MSA implementations based on T-Coffee, one of the most widely used MSA aligners, in order to find new optimizations that may improve the average execution time on multi-core systems. We found that the current parallel implementations have performance issues that negatively affect the scalability of the process. Finally, we present a proposed implementation that uses threads in conjunction with a message-passing library, with the aim of optimizing the execution of MSA on multi-core-based clusters.
Acknowledgements
This work was supported by the MEyC-Spain under contract TIN 2008-05913 and Consolider CSD2007-0050, by the CUR of DIUE of GENCAT, and by the European Social Fund.
Cite this article
Montañola, A., Roig, C., Guirado, F. et al. Performance analysis of computational approaches to solve Multiple Sequence Alignment. J Supercomput 64, 69–78 (2013). https://doi.org/10.1007/s11227-012-0751-4
DOI: https://doi.org/10.1007/s11227-012-0751-4