Assembling Genomes and Mini-metagenomes from Highly Chimeric Reads
Recent advances in single-cell genomics provide an alternative to gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging because of (i) highly non-uniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing the “dark matter of life”: the approach forms small pools of randomly selected single cells (called a mini-metagenome) and then sequences all genomes from the mini-metagenome at once. We demonstrate that SPAdes enables sequencing of mini-metagenomes and benchmark it against various assemblers. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers, which were specifically designed for single-cell sequencing. For standard (multicell) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet.
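To make the term "bulge" concrete, the following is a minimal illustrative sketch and not the SPAdes algorithm: it builds an uncondensed de Bruijn graph from reads and reports simple bulges, meaning two unbranched paths that leave the same node and reconverge at the same node. The function names (`build_de_bruijn`, `find_simple_bulges`), the `max_len` bound, and the toy reads are all assumptions made for illustration. The sketch ignores reverse complements, and it does not attempt the complex bulges or chimeric edges that the abstract refers to.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return {(prefix, suffix): coverage} for every k-mer seen in the reads.

    Nodes are (k-1)-mers; each k-mer contributes one edge prefix -> suffix.
    """
    edges = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


def find_simple_bulges(edges, max_len=10):
    """Find pairs of unbranched paths sharing both start and end nodes.

    Returns a list of (weaker_path, stronger_path), ranked by mean edge
    coverage. max_len (hypothetical parameter) bounds path length in edges.
    """
    out_adj = defaultdict(list)
    in_deg = defaultdict(int)
    for (u, v) in edges:
        out_adj[u].append(v)
        in_deg[v] += 1

    def walk(start, first):
        # Extend while the last node has exactly one way in and one way out.
        path = [start, first]
        while (len(path) - 1 < max_len
               and in_deg[path[-1]] == 1
               and len(out_adj.get(path[-1], [])) == 1):
            path.append(out_adj[path[-1]][0])
        return path

    def mean_cov(path):
        covs = [edges[(a, b)] for a, b in zip(path, path[1:])]
        return sum(covs) / len(covs)

    bulges = []
    for u, succs in list(out_adj.items()):
        if len(succs) < 2:
            continue
        by_end = defaultdict(list)
        for v in succs:
            p = walk(u, v)
            by_end[p[-1]].append(p)
        for paths in by_end.values():
            if len(paths) >= 2:
                paths.sort(key=mean_cov)
                bulges.append((paths[0], paths[-1]))
    return bulges


# Toy input: three copies of a correct read and one read with a substitution.
reads = ["ATGGCGTGCAAT"] * 3 + ["ATGGCTTGCAAT"]
edges = build_de_bruijn(reads, k=4)
bulges = find_simple_bulges(edges)
weaker, stronger = bulges[0]
```

In this toy case the weaker path corresponds to the single erroneous read. Ranking branches by coverage is only a stand-in here: with the highly non-uniform coverage of single-cell data, a plain coverage comparison is not a reliable criterion.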
Keywords: Dark Matter, Directed Acyclic Graph, Genome Length, Human Microbiome, False Edge