From de Bruijn Graphs to Rectangle Graphs for Genome Assembly
Jigsaw puzzles were originally constructed by painting a picture on a rectangular piece of wood and then cutting it into smaller pieces with a jigsaw. The Jigsaw Puzzle Problem is to find an arrangement of these pieces that fills the rectangle in such a way that neighboring pieces have “matching” boundaries with respect to color and texture. While the general Jigsaw Puzzle Problem is NP-complete, we discuss a simpler version (called the Rectangle Puzzle Problem) and study rectangle graphs, recently introduced by Bankevich et al., 2012, for assembling such puzzles. We establish the connection between the Rectangle Puzzle Problem and the problem of assembling genomes from read-pairs, and further extend the existing analysis to real challenges encountered in applications of rectangle graphs in genome assembly. We demonstrate that addressing these challenges results in an assembler, SPAdes+, that improves on existing assembly algorithms in the case of bacterial genomes (including the particularly difficult case of genome assembly from single cells).
SPAdes+ is freely available from http://bioinf.spbau.ru/spades.
Keywords: Jigsaw Puzzle, Fragment Assembly, Puzzle Problem, Eulerian Cycle, Matching Side
- 1. Aardenne-Ehrenfest, T., Bruijn, N.G.: Circuits and trees in oriented linear graphs. Classic Papers in Combinatorics, 149–163 (1987)
- 3. Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S., Lesin, V.M., Nikolenko, S.I., Pham, S., Prjibelski, A.D., et al.: SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology 19(5), 455–477 (2012)
- 8. Kampel, M., Sablatnig, R.: 3D puzzling of archeological fragments. In: Proc. of 9th Computer Vision Winter Workshop, vol. 2. Slovenian Pattern Recognition Society (2004)
- 10. Medvedev, P., Pham, S., Chaisson, M., Tesler, G., Pevzner, P.: Paired de Bruijn graphs: A novel approach for incorporating mate pair information into genome assemblers. Journal of Computational Biology, 1625–1634 (2011)
- 11. Pevzner, P.A., Tang, H.: Fragment assembly with double-barreled data. Bioinformatics 17(suppl. 1), S225 (2001)