Sequencing and Genome Assembly Using Next-Generation Technologies

Computational Biology

Part of the book series: Methods in Molecular Biology (MIMB, volume 673)

Abstract

Several sequencing technologies have been introduced in recent years that dramatically outperform the traditional Sanger technology in terms of throughput and cost. The data generated by these technologies are characterized by generally shorter read lengths (as low as 35 bp) and different error characteristics than Sanger data. Existing software tools for assembly and analysis of sequencing data are, therefore, ill-suited to handle the new types of data generated. This paper surveys the recent software packages aimed specifically at analyzing next-generation sequencing data.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Nagarajan, N., Pop, M. (2010). Sequencing and Genome Assembly Using Next-Generation Technologies. In: Fenyö, D. (eds) Computational Biology. Methods in Molecular Biology, vol 673. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-842-3_1

  • DOI: https://doi.org/10.1007/978-1-60761-842-3_1

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60761-841-6

  • Online ISBN: 978-1-60761-842-3

  • eBook Packages: Springer Protocols
