Using Cascading Bloom Filters to Improve the Memory Usage for de Brujin Graphs

  • Kamil Salikhov
  • Gustavo Sacomoto
  • Gregory Kucherov
Conference paper

DOI: 10.1007/978-3-642-40453-5_28

Part of the Lecture Notes in Computer Science book series (LNCS, volume 8126)
Cite this paper as:
Salikhov K., Sacomoto G., Kucherov G. (2013) Using Cascading Bloom Filters to Improve the Memory Usage for de Brujin Graphs. In: Darling A., Stoye J. (eds) Algorithms in Bioinformatics. WABI 2013. Lecture Notes in Computer Science, vol 8126. Springer, Berlin, Heidelberg

Abstract

De Bruijn graphs are widely used in bioinformatics for processing next-generation sequencing (NGS) data. Due to the very large size of NGS datasets, it is essential to represent de Bruijn graphs compactly, and several approaches to this problem have been proposed recently. In this work, we show how to reduce the memory required by the algorithm of Chikhi and Rizk (WABI, 2012) that represents de Bruijn graphs using Bloom filters. Our method requires 30% to 40% less memory than their method, with insignificant impact on construction time. At the same time, our experiments showed a better query time compared to their method. This is, to our knowledge, the best practical representation for de Bruijn graphs.
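
The abstract does not spell out the construction, so the following is only a minimal sketch of how a cascade of Bloom filters can answer k-mer membership exactly over a known set of possible queries. Each filter stores the items that the previous filter wrongly accepted, and a small explicit set catches whatever survives the last level. The class names, the SHA-256-based hashing, the filter sizes, the number of levels, and the toy sequence are all illustrative assumptions rather than the authors' implementation, and reverse complements are ignored.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter over strings (illustrative, not tuned for speed)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive one bit position per hash function by prefixing a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "little") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class BloomCascade:
    """Exact membership for queries drawn from members | non_members."""

    def __init__(self, members, non_members, levels=4,
                 bits_per_item=8, num_hashes=4):
        stored = set(members)
        others = set(non_members) - stored
        self.filters = []
        for _ in range(levels):
            bf = BloomFilter(max(8, bits_per_item * len(stored)), num_hashes)
            for item in stored:
                bf.add(item)
            self.filters.append(bf)
            # Items from the opposite side that this filter wrongly accepts
            # are what the next level has to store.
            stored, others = {x for x in others if x in bf}, stored
        # Whatever survives the last filter is kept explicitly.
        self.remainder = stored
        self.remainder_is_member = levels % 2 == 0

    def __contains__(self, item):
        for level, bf in enumerate(self.filters):
            if item not in bf:
                # Rejected by an even-indexed filter (1st, 3rd, ...): absent.
                # Rejected by an odd-indexed filter (2nd, 4th, ...): present.
                return level % 2 == 1
        return (item in self.remainder) == self.remainder_is_member


def kmers_of(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Toy data: k-mers of a short sequence and their one-letter extensions,
# which stand in for the queries issued during graph traversal.
kmers = kmers_of("ACGTACGGTCAATGCCGTA", 5)
neighbors = ({km[1:] + c for km in kmers for c in "ACGT"}
             | {c + km[:-1] for km in kmers for c in "ACGT"})
cascade = BloomCascade(kmers, neighbors)
```

In this sketch, `kmer in cascade` is exact only for strings in `kmers | neighbors`; an arbitrary string outside that universe can still be misclassified, which matches the usual assumption that the graph is only queried while walking from known nodes.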

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Kamil Salikhov (1)
  • Gustavo Sacomoto (2, 3)
  • Gregory Kucherov (4, 5)

  1. Lomonosov Moscow State University, Moscow, Russia
  2. INRIA Grenoble Rhône-Alpes, France
  3. Laboratoire Biométrie et Biologie Evolutive, Université Lyon 1, Lyon, France
  4. Department of Computer Science, Ben-Gurion University of the Negev, Be’er Sheva, Israel
  5. Laboratoire d’Informatique Gaspard Monge, Université Paris-Est & CNRS, Paris, France
