Sprites2: Detection of Deletions Based on an Accurate Alignment Strategy

  • Zhen Zhang
  • Jianxin Wang
  • Junwei Luo
  • Juan Shang
  • Min Li
  • Fang-Xiang Wu
  • Yi Pan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10847)


Because humans are diploid, both homozygous and heterozygous deletions are ubiquitous in the human genome, and distinguishing between them is an important problem for current structural variation detection tools. In addition, because of sequencing errors, micro-homologies and micro-insertions, the breakpoint locations reported by common alignment tools that use a greedy strategy may not be the true deletion locations, which often leads to false structural variation calls. In this paper, we propose a deletion detection method called Sprites2. Compared with Sprites, Sprites2 adds two new modules: (1) it uses the variance of the insert size distribution to determine the type of each deletion, which improves the accuracy of deletion calls; (2) it uses a novel alignment strategy based on AGE (an algorithm that aligns the 5’ and 3’ ends of two sequences simultaneously) to locate breakpoints, which addresses the problems introduced by sequencing errors, micro-homologies and micro-insertions. To evaluate Sprites2, we ran experiments on simulated and real datasets and compared it with several popular structural variation detection tools. The experimental results show that Sprites2 can improve deletion detection performance. Sprites2 is publicly available at
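
To make point (1) of the abstract concrete, the following is a minimal sketch of how insert sizes might separate the two deletion types. It is not the Sprites2 implementation. It assumes that read pairs spanning a homozygous deletion all show an enlarged insert size (shifted mean, small variance), while pairs spanning a heterozygous deletion are a mixture of normal and enlarged insert sizes (large variance). The function name `classify_deletion`, the thresholds `var_factor` and `shift_factor`, and the example values are hypothetical.

```python
from statistics import mean, pvariance


def classify_deletion(insert_sizes, lib_mean, lib_sd,
                      var_factor=4.0, shift_factor=3.0):
    """Label a candidate deletion from the insert sizes of spanning read pairs.

    insert_sizes: observed insert sizes of pairs spanning the candidate region
    lib_mean, lib_sd: mean and standard deviation of the library insert size
    var_factor, shift_factor: illustrative thresholds (assumptions, not
    values taken from the paper)
    """
    if len(insert_sizes) < 2:
        return "unknown"

    observed_mean = mean(insert_sizes)
    observed_var = pvariance(insert_sizes)

    # A mixture of normal and enlarged pairs inflates the variance well
    # beyond what the library alone would produce.
    if observed_var > var_factor * lib_sd ** 2:
        return "heterozygous"

    # Tight distribution, but shifted upward: every pair spans the deletion.
    if observed_mean - lib_mean > shift_factor * lib_sd:
        return "homozygous"

    return "no_deletion"


# Hypothetical library: mean insert size 300 bp, standard deviation 30 bp.
print(classify_deletion([795, 810, 802, 790, 805, 798], 300, 30))  # homozygous
print(classify_deletion([295, 310, 302, 790, 805, 798], 300, 30))  # heterozygous
print(classify_deletion([295, 310, 302, 290, 305, 298], 300, 30))  # no_deletion
```

In practice the thresholds would need to be tuned to the coverage and the library, and a real caller would combine this signal with split-read evidence rather than rely on insert sizes alone.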


Keywords: Structural variation, Deletion detection, Alignment strategy, Sequence analysis



This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61732009, 61622213, 61728211, 61772552, 61772557 and 61602156.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Zhen Zhang (1)
  • Jianxin Wang (1), corresponding author
  • Junwei Luo (1, 2)
  • Juan Shang (1)
  • Min Li (1)
  • Fang-Xiang Wu (3)
  • Yi Pan (4)
  1. School of Information Science and Engineering, Central South University, Changsha, China
  2. College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, China
  3. Division of Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Canada
  4. Department of Computer Science, Georgia State University, Atlanta, USA
