Expressed Sequence Tags (ESTs), pp 1-12
Expressed Sequence Tags: An Overview
Abstract
Expressed sequence tags (ESTs) are fragments of mRNA sequences derived through single sequencing reactions performed on randomly selected clones from cDNA libraries. To date, over 45 million ESTs have been generated from over 1400 different species of eukaryotes. For the most part, EST projects are used either to complement existing genome projects or to serve as low-cost alternatives for gene discovery. However, with improvements in accuracy and coverage, ESTs are beginning to find application in fields such as phylogenetics, transcript profiling and proteomics. This volume provides practical details on the generation and analysis of ESTs. Its chapters cover the creation of cDNA libraries; the generation and processing of sequence data; the bioinformatics analysis of ESTs; and their application to phylogenetics and transcript profiling.
Key words
Expressed sequence tags, EST, cDNA libraries, database resources, dbEST, gene discovery, transcript profiling