From de Bruijn Graphs to Rectangle Graphs for Genome Assembly

  • Nikolay Vyahhi
  • Alex Pyshkin
  • Son Pham
  • Pavel A. Pevzner
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7534)

Abstract

Jigsaw puzzles were originally constructed by painting a picture on a rectangular piece of wood and then cutting it into smaller pieces with a jigsaw. The Jigsaw Puzzle Problem is to find an arrangement of these pieces that fills the rectangle in such a way that neighboring pieces have “matching” boundaries with respect to color and texture. While the general Jigsaw Puzzle Problem is NP-complete [6], we discuss a simpler version (called the Rectangle Puzzle Problem) and study rectangle graphs, recently introduced by Bankevich et al., 2012 [3], for assembling such puzzles. We establish the connection between the Rectangle Puzzle Problem and the problem of assembling genomes from read-pairs, and further extend the analysis in [3] to real challenges encountered in applications of rectangle graphs to genome assembly. We demonstrate that addressing these challenges results in an assembler, SPAdes+, that improves on existing assembly algorithms in the case of bacterial genomes (including the particularly difficult case of genome assemblies from single cells).

SPAdes+ is freely available from http://bioinf.spbau.ru/spades.

References

  1. Aardenne-Ehrenfest, T., Bruijn, N.G.: Circuits and trees in oriented linear graphs. Classic papers in combinatorics, 149–163 (1987)
  2. Abrham, J., Kotzig, A.: Transformations of Euler tours. Annals of Discrete Mathematics 8, 65–69 (1980)
  3. Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S., Lesin, V.M., Nikolenko, S.I., Pham, S., Prjibelski, A.D., et al.: SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology 19(5), 455–477 (2012)
  4. Chaisson, M.J., Pevzner, P.A.: Short read fragment assembly of bacterial genomes. Genome Research 18(2), 324 (2008)
  5. Chitsaz, H., Yee-Greenbaum, J.L., Tesler, G., et al.: Efficient de novo assembly of single-cell bacterial genomes from short-read data sets. Nat. Biotechnol. 29(10), 915–921 (2011)
  6. Demaine, E.D., Demaine, M.L.: Jigsaw puzzles, edge matching, and polyomino packing: Connections and complexity. Graphs and Combinatorics 23, 195–208 (2007)
  7. Idury, R.M., Waterman, M.S.: A new algorithm for DNA sequence assembly. Journal of Computational Biology 2(2), 291–306 (1995)
  8. Kampel, M., Sablatnig, R.: 3D puzzling of archeological fragments. In: Proc. of 9th Computer Vision Winter Workshop, vol. 2. Slovenian Pattern Recognition Society (2004)
  9. Li, R., Zhu, H., Ruan, J., Qian, W., Fang, X., Shi, Z., Li, Y., Li, S., Shan, G., Kristiansen, K., et al.: De novo assembly of human genomes with massively parallel short read sequencing. Genome Research 20(2), 265 (2010)
  10. Medvedev, P., Pham, S., Chaisson, M., Tesler, G., Pevzner, P.: Paired de Bruijn graphs: A novel approach for incorporating mate pair information into genome assemblers. Journal of Computational Biology, 1625–1634 (2011)
  11. Pevzner, P.A., Tang, H.: Fragment assembly with double-barreled data. Bioinformatics 17(suppl. 1), S225 (2001)
  12. Pham, S.K., Antipov, D., Sirotkin, A., Tesler, G., Pevzner, P.A., Alekseyev, M.A.: Pathset Graphs: A Novel Approach for Comprehensive Utilization of Paired Reads in Genome Assembly. In: Chor, B. (ed.) RECOMB 2012. LNCS, vol. 7262, pp. 200–212. Springer, Heidelberg (2012)
  13. Zerbino, D.R., Birney, E.: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Research 18(5), 821 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Nikolay Vyahhi (1)
  • Alex Pyshkin (1)
  • Son Pham (2)
  • Pavel A. Pevzner (1, 2)

  1. Algorithmic Biology Laboratory, St. Petersburg Academic University, Russia
  2. Department of Computer Science and Engineering, UCSD, La Jolla, USA
