Mapping-Free and Assembly-Free Discovery of Inversion Breakpoints from Raw NGS Reads

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNBI, volume 8542)

Abstract

We propose a formal model and an algorithm for detecting inversion breakpoints without a reference genome, directly from raw NGS data. This model is characterized by a fixed-size topological pattern in the de Bruijn graph. We describe precisely the possible sources of false positives and false negatives, and we propose a sequence-based filter that gives a good trade-off between the precision and recall of the method. We implemented these ideas in a prototype called TakeABreak. Applied to simulated inversions in genomes of various complexity (from E. coli to a human chromosome dataset), TakeABreak gave promising results with a low memory footprint and a short computation time.
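As an illustration of why an inversion can leave a small, local signature in a de Bruijn graph, the following Python sketch compares the canonical k-mers of a sequence with those of the same sequence after one segment has been reverse-complemented. It is not the TakeABreak algorithm, whose pattern and filter are not specified in this abstract. The function names, the toy sequence, and the coordinates are hypothetical, and the sketch assumes error-free sequences over the alphabet A, C, G, T. It relies only on a standard observation: k-mers lying entirely inside or entirely outside the inverted segment are unchanged once each k-mer is identified with its reverse complement, so only k-mers spanning one of the two breakpoints can differ.

```python
# Illustrative sketch only; not the method implemented in TakeABreak.
# All names below are hypothetical and chosen for this example.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Return the set of canonical k-mers of seq.

    A k-mer and its reverse complement are represented by whichever of
    the two is lexicographically smaller, as is common for de Bruijn
    graphs built from double-stranded DNA.
    """
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def invert_segment(seq, start, end):
    """Simulate an inversion of seq[start:end] (0-based, end exclusive)."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


def breakpoint_kmers(ref, start, end, k):
    """Return (k-mers only in ref, k-mers only in the inverted sequence)."""
    inverted = invert_segment(ref, start, end)
    ref_kmers = canonical_kmers(ref, k)
    inv_kmers = canonical_kmers(inverted, k)
    return ref_kmers - inv_kmers, inv_kmers - ref_kmers


if __name__ == "__main__":
    # Toy sequence and coordinates, assumed for demonstration only.
    ref = "ACGGTCATTGCAAGCTTAGGCCATATCGGA"
    only_ref, only_inv = breakpoint_kmers(ref, 8, 20, k=5)
    print("k-mers specific to the reference:", sorted(only_ref))
    print("k-mers specific to the inversion:", sorted(only_inv))
```

Each side of the comparison contains at most 2(k-1) specific k-mers, because exactly k-1 windows of length k span each breakpoint. This bounded difference is consistent with describing an inversion by a fixed-size pattern, although the exact pattern used by the authors is defined in the full paper rather than here.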

Keywords

  • structural variant
  • NGS
  • reference-free
  • de Bruijn graph

Chapter

  • DOI: 10.1007/978-3-319-07953-0_10
  • Chapter length: 12 pages

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Lemaitre, C., Ciortuz, L., Peterlongo, P. (2014). Mapping-Free and Assembly-Free Discovery of Inversion Breakpoints from Raw NGS Reads. In: Dediu, AH., Martín-Vide, C., Truthe, B. (eds) Algorithms for Computational Biology. AlCoB 2014. Lecture Notes in Computer Science, vol 8542. Springer, Cham. https://doi.org/10.1007/978-3-319-07953-0_10

  • DOI: https://doi.org/10.1007/978-3-319-07953-0_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07952-3

  • Online ISBN: 978-3-319-07953-0

  • eBook Packages: Computer Science, Computer Science (R0)