SVEM: A Structural Variant Estimation Method Using Multi-mapped Reads on Breakpoints
Recent developments in next-generation sequencing (NGS) technologies have led to the identification of structural variants (SVs) of genomic DNA in the human population. Several SV detection methods that use NGS data have been proposed. However, the analysis of NGS data poses several difficulties, particularly in handling reads from duplicated loci or low-complexity sequences of the human genome. In this paper, we propose SVEM, a novel statistical method that detects SVs at single-nucleotide resolution and can utilize multi-mapped reads on breakpoints. SVEM estimates the number of reads on breakpoints as parameters and the mapping states as latent variables using the expectation-maximization algorithm. This framework enables us to handle ambiguous read mappings without discarding information useful for SV detection. SVEM is applied to simulated and real data, and it achieves better performance than existing methods in terms of precision and recall.
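The general idea of treating mapping states as latent variables can be illustrated with a minimal sketch. This is not the SVEM model itself, whose likelihood is not given in the abstract; it is a generic mixture-proportion EM, assuming each read comes with a per-breakpoint likelihood. The function name `em_breakpoint_abundance` and the input matrix `compat` are hypothetical and introduced only for illustration.

```python
def em_breakpoint_abundance(compat, n_iter=200, tol=1e-9):
    """Generic EM for read proportions over candidate breakpoints.

    compat[i][k] is an assumed likelihood of read i given that it
    originates from candidate breakpoint k (0.0 if it does not map there).
    Returns (theta, counts): estimated proportions and the expected read
    counts from the last E-step.
    """
    n_bp = len(compat[0])
    theta = [1.0 / n_bp] * n_bp  # start from uniform proportions
    counts = [0.0] * n_bp
    for _ in range(n_iter):
        # E-step: split each read fractionally across the breakpoints
        # it maps to, weighted by the current proportions.
        counts = [0.0] * n_bp
        for row in compat:
            weights = [theta[k] * row[k] for k in range(n_bp)]
            total = sum(weights)
            if total == 0.0:
                continue  # read is incompatible with every candidate
            for k in range(n_bp):
                counts[k] += weights[k] / total
        n_used = sum(counts)
        if n_used == 0.0:
            break  # no informative reads; keep the current estimate
        # M-step: proportions are the normalized expected counts.
        new_theta = [c / n_used for c in counts]
        delta = max(abs(a - b) for a, b in zip(theta, new_theta))
        theta = new_theta
        if delta < tol:
            break
    return theta, counts


# Toy input: 3 reads unique to breakpoint 0, 1 read unique to
# breakpoint 1, and 2 reads that map equally well to both.
toy = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] + [[1.0, 1.0]] * 2
theta, counts = em_breakpoint_abundance(toy)
```

On this toy input the two ambiguous reads are not discarded; they are shared in proportion to the evidence from uniquely mapped reads, so the estimated proportions converge to approximately 0.75 and 0.25.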
Keywords: Reference Genome, Copy Number Change, Next Generation Sequencing Data, Single Nucleotide Poly, Single Nucleotide Resolution