Comparing Bowtie and BWA to Align Short Reads from a RNA-Seq Experiment

  • N. Medina-Medina
  • A. Broka
  • S. Lacey
  • H. Lin
  • E. S. Klings
  • C. T. Baldwin
  • M. H. Steinberg
  • P. Sebastiani
Part of the Advances in Intelligent and Soft Computing book series (AINSC, volume 154)

Abstract

High-throughput sequencing technologies are a significant innovation that can contribute to important advances in genetic research. In recent years, many algorithms have been developed to align the large number of short nucleotide sequences generated by these technologies. Choosing among the available alignment algorithms is difficult; to assist with this decision, we evaluate several algorithms for the mapping of RNA-Seq data. The comparison was completed in two phases. An initial phase narrowed the comparison down to the three algorithms implemented in the tools ELAND, Bowtie and BWA. A second phase compared these tools in terms of runtime, alignment coverage and process control.

Keywords

RNA-Seq, high-throughput sequencing, short reads alignment, ELAND, BWA and Bowtie

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • N. Medina-Medina (1)
  • A. Broka (2)
  • S. Lacey (3)
  • H. Lin (4)
  • E. S. Klings (4)
  • C. T. Baldwin (4)
  • M. H. Steinberg (4)
  • P. Sebastiani (3)

  1. Department L.S.I, Technical School of Computer and Telecommunications Engineering, University of Granada, Granada, Spain
  2. Boston University LinGA Computing Resource, Boston, USA
  3. Department of Biostatistics, Boston University School of Public Health, Boston, USA
  4. Department of Medicine, Boston University School of Medicine, Boston, USA
