Library Preparation and Data Analysis Packages for Rapid Genome Sequencing

Fungal Secondary Metabolism

Abstract

High-throughput sequencing (HTS) has quickly become a valuable tool for comparative genetics and genomics and is now routinely carried out in laboratories that are not connected to large sequencing centers. Here we describe an updated version of our protocol for constructing single- and paired-end Illumina sequencing libraries, beginning with purified genomic DNA. The protocol can also be used for “multiplexing,” i.e., the analysis of several samples in a single flowcell lane, by generating “barcoded” or “indexed” Illumina sequencing libraries in a way that is independent of Illumina-supported methods. To analyze sequencing results, we suggest several independent approaches, but end users should be aware that this is a quickly evolving field in which many alignment (or “mapping”) and counting algorithms are still being developed and tested.
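
To make the idea of multiplexing concrete, the sketch below shows one way reads from a single lane might be sorted back into per-sample bins. It is an illustration only, not the chapter's own pipeline. It assumes that each barcode occupies the first bases of the read, that all barcodes have the same length, and that only exact matches are accepted. The function names, barcode sequences, and sample labels are hypothetical.

```python
from collections import defaultdict
from io import StringIO


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def demultiplex(handle, barcodes):
    """Assign reads to samples by exact match of a leading barcode.

    `barcodes` maps barcode sequence -> sample label (hypothetical values).
    The barcode is trimmed from both the sequence and the quality string.
    Reads without a matching barcode go to the "unassigned" bin.
    """
    bins = defaultdict(list)
    for header, seq, qual in read_fastq(handle):
        for barcode, sample in barcodes.items():
            if seq.startswith(barcode):
                n = len(barcode)
                bins[sample].append((header, seq[n:], qual[n:]))
                break
        else:
            bins["unassigned"].append((header, seq, qual))
    return dict(bins)


# Made-up example data: three reads, two of which carry a known barcode.
EXAMPLE_FASTQ = (
    "@read1\nACGTTTGACCA\n+\nIIIIIIIIIII\n"
    "@read2\nTGCAGGCATTA\n+\nIIIIIIIIIII\n"
    "@read3\nGGGGCCCCAAA\n+\nIIIIIIIIIII\n"
)
EXAMPLE_BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}

if __name__ == "__main__":
    result = demultiplex(StringIO(EXAMPLE_FASTQ), EXAMPLE_BARCODES)
    for sample, reads in sorted(result.items()):
        print(sample, len(reads))
```

Exact matching keeps the sketch short. In practice a tolerance for sequencing errors in the barcode may be needed, which in turn depends on how the barcode set was designed.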

Acknowledgments

We thank Mark Dasenko, Chris Sullivan, Steve Drake, Matthew Peterson, and Scott Givan at the OSU CGRB core facility for assistance with Illumina sequencing, and Chris Sullivan, Jason Cumbie, Noah Fahlgren, and Henry Priest for helpful discussions and sharing code. Work in our laboratory is supported by funds from the American Cancer Society (RSG-08-030-01-CCG), the National Institutes of Health (P01GM068087 and R01GM097637), and start-up funds from the OSU Computational and Genome Biology Initiative. The authors have no conflicting interests.

Corresponding author

Correspondence to Michael Freitag.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Pomraning, K.R., Smith, K.M., Bredeweg, E.L., Connolly, L.R., Phatale, P.A., Freitag, M. (2012). Library Preparation and Data Analysis Packages for Rapid Genome Sequencing. In: Keller, N., Turner, G. (eds) Fungal Secondary Metabolism. Methods in Molecular Biology, vol 944. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-122-6_1

  • DOI: https://doi.org/10.1007/978-1-62703-122-6_1

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-121-9

  • Online ISBN: 978-1-62703-122-6

  • eBook Packages: Springer Protocols
