Constructing String Graphs in External Memory
In this paper we present an efficient external memory algorithm to compute the string graph, a fundamental data representation used for sequence assembly, from a collection of reads.
Our algorithm builds upon recent results on lightweight Burrows-Wheeler Transform (BWT) and Longest Common Prefix (LCP) construction and provides, as a by-product, an efficient procedure to extend intervals of the BWT that could be of independent interest.
We have implemented our algorithm and compared its efficiency against SGA—the most advanced assembly string graph construction program.
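To make the notion of extending a BWT interval concrete, the following is a minimal in-memory sketch of the standard backward (left) extension step used with FM-index style structures. It is not the external memory procedure of the paper: the function names (`bwt`, `first_column_offsets`, `extend_left`, `pattern_interval`) are hypothetical, the BWT is built naively from sorted rotations, and occurrence counts are recomputed by scanning rather than read from an index.

```python
def bwt(text):
    """Naive BWT via sorted rotations (assumption: text ends with a unique '$'
    that is smaller than every other symbol)."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)


def first_column_offsets(bwt_str):
    """For each symbol c, count the symbols in the text strictly smaller than c."""
    offsets, total = {}, 0
    for c in sorted(set(bwt_str)):
        offsets[c] = total
        total += bwt_str.count(c)
    return offsets


def extend_left(bwt_str, offsets, interval, c):
    """Map the half-open interval of suffixes prefixed by w to the interval
    of suffixes prefixed by c + w."""
    lo, hi = interval
    if c not in offsets:
        return (0, 0)
    # Rank of c in bwt_str[:lo] and bwt_str[:hi], computed by a linear scan.
    new_lo = offsets[c] + bwt_str.count(c, 0, lo)
    new_hi = offsets[c] + bwt_str.count(c, 0, hi)
    return (new_lo, new_hi)


def pattern_interval(bwt_str, pattern):
    """Find the interval of a pattern by extending right to left, one symbol at a time."""
    offsets = first_column_offsets(bwt_str)
    interval = (0, len(bwt_str))  # the empty word matches every suffix
    for c in reversed(pattern):
        interval = extend_left(bwt_str, offsets, interval, c)
        if interval[0] >= interval[1]:
            return (0, 0)  # pattern does not occur
    return interval
```

For the toy text `banana$`, `pattern_interval(bwt("banana$"), "ana")` returns `(2, 4)`, the two sorted suffixes that start with `ana`. Overlaps between reads can be detected with the same kind of interval extension applied to a collection of sentinel-terminated reads, which this sketch does not attempt.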
Keywords: Main Memory, External Memory, Secondary Memory, Graph Reduction, Splice Graph
- 8. Lam, T., Li, R., Tam, A., Wong, S., Wu, E., Yiu, S.: High throughput short read alignment via bi-directional BWT. In: BIBM 2009, pp. 31–36 (2009)
- 9. Myers, E.: The fragment assembly string graph. Bioinformatics 21, ii79–ii85 (2005)
- 13. Simpson, J., Durbin, R.: Efficient construction of an assembly string graph using the FM-index. Bioinformatics 26(12), i367–i373 (2010)