Global Assembly of Expressed Sequence Tags

Abstract

The method described here for constructing Expressed Sequence Tag (EST) assemblies takes as input reads generated by Sanger sequencing, 454 pyrosequencing, and Illumina (Solexa) sequencing. It is consistent with, and parallels, many established EST assembly protocols, for example the TIGR Gene Indices. Reads used as input to the EST assembly process usually come from both internal and external sources. Thus, in addition to internally generated EST reads, expressed transcripts are collected from dbEST and from the NCBI GenBank nucleotide database (full-length and partial cDNAs). “Virtual” transcript sequences derived from whole-genome annotation projects can be excluded, depending on the needs of the project. Currently, in most cases, 454-derived sequences can be treated similarly to Sanger-derived ESTs. In contrast, the shorter Solexa-derived sequences must first undergo either a round of de novo assembly or an “align-then-assemble” approach against a reference genome, if one is available. Only then can the resulting transcripts be used in a global EST assembly that combines Sanger and next-generation sequencing data.
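The routing of reads by sequencing technology described in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the `Read` record, the platform and source labels, and the `route_reads` function are hypothetical names introduced here for illustration only. The sketch assumes that each read carries a platform label and a source label, that Sanger and 454 reads go directly to the global assembly, and that Solexa reads are set aside for a pre-assembly step first.

```python
from dataclasses import dataclass

# Hypothetical labels: the abstract names the technologies, not these strings.
DIRECT_PLATFORMS = {"sanger", "454"}   # usable as-is for global assembly
PREASSEMBLY_PLATFORMS = {"solexa"}     # short reads, pre-assembled first


@dataclass(frozen=True)
class Read:
    read_id: str
    platform: str  # e.g. "sanger", "454", "solexa"
    source: str    # e.g. "internal", "dbEST", "GenBank", "virtual"


def route_reads(reads, exclude_virtual=True):
    """Split reads into those that can enter the global assembly directly
    and those that need de novo or align-then-assemble pre-assembly."""
    direct, needs_preassembly = [], []
    for read in reads:
        # "Virtual" transcripts from genome annotation are optional input.
        if exclude_virtual and read.source == "virtual":
            continue
        platform = read.platform.lower()
        if platform in DIRECT_PLATFORMS:
            direct.append(read)
        elif platform in PREASSEMBLY_PLATFORMS:
            needs_preassembly.append(read)
        else:
            raise ValueError(f"unknown platform: {read.platform!r}")
    return direct, needs_preassembly


reads = [
    Read("est_001", "Sanger", "dbEST"),
    Read("est_002", "454", "internal"),
    Read("est_003", "Solexa", "internal"),
    Read("est_004", "Sanger", "virtual"),
]
direct, needs_preassembly = route_reads(reads)
print([r.read_id for r in direct])
print([r.read_id for r in needs_preassembly])
```

In this sketch the contigs produced from the pre-assembly group would be added to the direct group before the global assembly is run; that step depends on the assembler chosen and is left out.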