Chapter

Algorithms for Computational Biology

Volume 8542 of the series Lecture Notes in Computer Science pp 119-130

Mapping-Free and Assembly-Free Discovery of Inversion Breakpoints from Raw NGS Reads

  • Claire Lemaitre (INRIA/IRISA/GenScale)
  • Liviu Ciortuz (INRIA/IRISA/GenScale; Faculty of Computer Science Iasi)
  • Pierre Peterlongo (INRIA/IRISA/GenScale)

Abstract

We propose a formal model and an algorithm for detecting inversion breakpoints without a reference genome, directly from raw NGS data. The model is characterized by a fixed-size topological pattern in the de Bruijn graph. We describe precisely the possible sources of false positives and false negatives, and we additionally propose a sequence-based filter that gives a good trade-off between the precision and the recall of the method. We implemented these ideas in a prototype called TakeABreak. Applied to simulated inversions in genomes of varying complexity (from E. coli to a human chromosome dataset), TakeABreak provided promising results with a low memory footprint and a short computation time.
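
To give some intuition for why an inversion can leave a small, local signature among k-mers (the nodes of a de Bruijn graph), the following minimal sketch compares the canonical k-mers of a toy sequence with those of the same sequence after one segment has been reverse-complemented. It is an illustration only and not the TakeABreak algorithm: it works on two known sequences rather than raw reads, and the function names (`canonical_kmers`, `invert_segment`, `novel_kmers`) are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Set of canonical k-mers: each k-mer is represented by the
    lexicographically smaller of itself and its reverse complement."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def invert_segment(seq, start, end):
    """Simulate an inversion: reverse-complement seq[start:end] in place."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


def novel_kmers(reference, variant, k):
    """Canonical k-mers present in the variant but absent from the reference."""
    return canonical_kmers(variant, k) - canonical_kmers(reference, k)


if __name__ == "__main__":
    reference = "AAAAACCCCC"          # toy sequence, assumption for illustration
    variant = invert_segment(reference, 3, 7)
    print(variant)
    print(sorted(novel_kmers(reference, variant, 3)))
```

In this sketch, k-mers lying entirely inside the inverted segment are reverse complements of k-mers already in the original sequence, so their canonical forms are unchanged. Only the k-mers that span one of the two breakpoints can be new, which gives at most 2(k-1) novel k-mers per inversion regardless of its length.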

Keywords

structural variant, NGS, reference-free, de Bruijn graph