Massively Parallel Sequencing Approaches for Characterization of Structural Variation
The emergence of next-generation sequencing (NGS) technologies offers an unprecedented opportunity to comprehensively study DNA sequence variation in human genomes. Commercially available platforms from Roche (454), Illumina (Genome Analyzer and HiSeq 2000), and Applied Biosystems (SOLiD) can sequence individual genomes completely and to high levels of coverage. NGS data are particularly advantageous for the study of structural variation (SV) because they offer the sensitivity to detect variants of various sizes and types, as well as the precision to characterize their breakpoints at base-pair resolution. In this chapter, we present methods and software algorithms that have been developed to detect SVs and copy number changes using massively parallel sequencing data. We also describe visualization and de novo assembly strategies for characterizing SV breakpoints and removing false positives.
Key words: Next-generation sequencing, Paired-end sequencing, 454, Illumina, Solexa, ABI SOLiD, Insertions, Deletions, Duplications, Inversions, Translocations, Indels, Copy number variants