Abstract
De novo transcriptome assembly is an important approach in RNA-Seq data analysis: it reconstructs the transcriptome and allows gene expression profiles to be investigated without a reference genome sequence. We carried out transcriptome assemblies on two RNA-Seq datasets generated from human brain and cell line samples, respectively, and compared three strategies to find an efficient way to obtain an optimal overall assembly. First, we assembled the brain and cell line transcriptomes using a single k-mer length. Next, we tested a range of k-mer lengths and coverage cutoffs. Lastly, we combined the contigs assembled over a range of k values into a final assembly. Comparing the results, we found that a single k-mer value is not sufficient to produce a good assembly, whereas combining the contigs from different k-mer values yields longer contigs and greatly improves the overall assembly.
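The third strategy, pooling contigs from several k-mer lengths, can be illustrated with a small sketch. The abstract does not say how the contigs were combined, so the rule used here is an assumption made for illustration only: a contig is dropped when it is an exact duplicate of, or is fully contained in, a longer contig on either strand. The function names `merge_contigs` and `n50` are hypothetical, and N50 is used only as a common summary of contig length, not necessarily the metric the authors reported.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def merge_contigs(assemblies):
    """Pool contigs from several single-k assemblies into one contig set.

    `assemblies` maps a k-mer length to the list of contig strings produced
    with that k. ASSUMPTION (not from the paper): a contig is redundant if
    it, or its reverse complement, is a substring of an already kept contig.
    """
    # Deduplicate, then visit longest contigs first so they are kept.
    pooled = sorted(
        {c for contigs in assemblies.values() for c in contigs},
        key=lambda c: (-len(c), c),
    )
    kept = []
    for contig in pooled:
        rc = reverse_complement(contig)
        if any(contig in longer or rc in longer for longer in kept):
            continue  # contained in a longer contig: redundant
        kept.append(contig)
    return kept


def n50(contigs):
    """Length L such that contigs of length >= L cover half the total bases."""
    lengths = sorted((len(c) for c in contigs), reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Toy contigs standing in for the output of three single-k assemblies.
toy = {
    21: ["ACGTACGTAA", "GGGCCC"],
    25: ["CCAATT"],
    31: ["ACGTACGTAATTGG", "TTTACG"],
}
merged = merge_contigs(toy)
print(len(merged), n50(merged))
```

The pairwise containment check is quadratic in the number of contigs, which is fine for a toy example but would need an index or a dedicated clustering tool on real assemblies.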
Additional information
This article is published with open access at Springerlink.com
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Chen, G., Yin, K., Wang, C. et al. De novo transcriptome assembly of RNA-Seq reads with different strategies. Sci. China Life Sci. 54, 1129–1133 (2011). https://doi.org/10.1007/s11427-011-4256-9
DOI: https://doi.org/10.1007/s11427-011-4256-9