Library Preparation and Data Analysis Packages for Rapid Genome Sequencing

Pomraning, Kyle R.; Smith, Kristina M.; Bredeweg, Erin L.; Connolly, Lanelle R.; Phatale, Pallavi A.; Freitag, Michael

doi:10.1007/978-1-62703-122-6_1

Part of the book series: Methods in Molecular Biology (MIMB, volume 944)

Abstract

High-throughput sequencing (HTS) has quickly become a valuable tool for comparative genetics and genomics and is now regularly carried out in laboratories that are not connected to large sequencing centers. Here we describe an updated version of our protocol for constructing single- and paired-end Illumina sequencing libraries, beginning with purified genomic DNA. The present protocol can also be used for “multiplexing,” i.e., the analysis of several samples in a single flowcell lane, by generating “barcoded” or “indexed” Illumina sequencing libraries in a way that is independent of Illumina-supported methods. To analyze sequencing results, we suggest several independent approaches, but end users should be aware that this is a quickly evolving field and that many alignment (or “mapping”) and counting algorithms are currently being developed and tested.
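
As a rough illustration of what multiplexing implies for downstream analysis, the following Python sketch sorts reads from a single lane into per-sample bins. It is not the chapter's own pipeline. It assumes inline barcodes at the 5' end of each read, exact barcode matching, and barcodes of equal length; the barcode sequences, sample names, and the functions `parse_fastq` and `demultiplex` are all hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table (assumption for illustration only);
# real barcodes depend on the adapters used during library construction.
BARCODES = {"ACGT": "sample_A", "TGCA": "sample_B"}


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from an iterable of FASTQ lines."""
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # skip the '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def demultiplex(records, barcodes):
    """Bin reads by exact match of their leading bases to a barcode.

    Matched reads have the barcode (and its quality values) trimmed off;
    unmatched reads are kept untrimmed under the key 'unassigned'.
    """
    length = len(next(iter(barcodes)))  # assumes all barcodes share one length
    bins = defaultdict(list)
    for header, seq, qual in records:
        sample = barcodes.get(seq[:length])
        if sample is None:
            bins["unassigned"].append((header, seq, qual))
        else:
            bins[sample].append((header, seq[length:], qual[length:]))
    return dict(bins)


# Three toy reads: two carry a known barcode, one does not.
fastq = """@read1
ACGTGGGGAAAA
+
IIIIIIIIIIII
@read2
TGCACCCCTTTT
+
IIIIIIIIIIII
@read3
NNNNACGTACGT
+
IIIIIIIIIIII
""".splitlines()

bins = demultiplex(parse_fastq(fastq), BARCODES)
for sample, reads in sorted(bins.items()):
    print(sample, len(reads))
```

Exact matching is the simplest possible choice; a production demultiplexer would typically tolerate sequencing errors in the barcode and write each bin to its own file before alignment.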

Acknowledgments

We thank Mark Dasenko, Chris Sullivan, Steve Drake, Matthew Peterson, and Scott Givan at the OSU CGRB core facility for assistance with Illumina sequencing, and Chris Sullivan, Jason Cumbie, Noah Fahlgren, and Henry Priest for helpful discussions and sharing code. Work in our laboratory is supported by funds from the American Cancer Society (RSG-08-030-01-CCG), the National Institutes of Health (P01GM068087 and R01GM097637), and start-up funds from the OSU Computational and Genome Biology Initiative. The authors have no conflicting interests.

Author information

Authors and Affiliations

Program for Molecular and Cellular Biology, Department of Biochemistry and Biophysics, Center for Genome Research and Biocomputing (CGRB), Oregon State University, Corvallis, OR, USA
Kyle R. Pomraning, Kristina M. Smith, Erin L. Bredeweg, Lanelle R. Connolly, Pallavi A. Phatale & Michael Freitag

Corresponding author

Correspondence to Michael Freitag.

Editor information

Editors and Affiliations

Dept. of Medical Microbiology & Immunology, University of Wisconsin-Madison, Linden Drive 1550, Madison, 53706, Wisconsin, USA
Nancy P. Keller
Dept. of Molecular Biology & Biotechnology, University of Sheffield, Firth Court, Sheffield, S10 2TN, United Kingdom
Geoffrey Turner

Cite this protocol

Pomraning, K.R., Smith, K.M., Bredeweg, E.L., Connolly, L.R., Phatale, P.A., Freitag, M. (2012). Library Preparation and Data Analysis Packages for Rapid Genome Sequencing. In: Keller, N., Turner, G. (eds) Fungal Secondary Metabolism. Methods in Molecular Biology, vol 944. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-122-6_1

DOI: https://doi.org/10.1007/978-1-62703-122-6_1
Published: 08 September 2012
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-121-9
Online ISBN: 978-1-62703-122-6
eBook Packages: Springer Protocols
